𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Soft vector quantization and the EM algorithm

✍ Scribed by Ethem Alpaydın


Publisher
Elsevier Science
Year
1998
Tongue
English
Weight
224 KB
Volume
11
Category
Article
ISSN
0893-6080


✦ Synopsis


The relation between hard c-means (HCM), fuzzy c-means (FCM), fuzzy learning vector quantization (FLVQ), the soft competition scheme (SCS) of Yair et al. (1992), and probabilistic Gaussian mixtures (GM) has been pointed out recently by Bezdek and Pal (1995). We extend this relation to their training, showing that the learning rules these models use to estimate the cluster centers can be seen as approximations to the expectation-maximization (EM) method as applied to Gaussian mixtures. HCM and unsupervised LVQ use 1-of-c type competition. In FCM and FLVQ, membership is the −2/(m − 1)th power of the distance. In SCS and GM, a Gaussian function is used. If the Gaussian membership function is used, the weighted within-groups sum of squared errors used as the fuzzy objective function corresponds to the maximum likelihood estimate in Gaussian mixtures with equal priors and covariances. The fuzzy c-means alternating optimization procedure (FCM-AO) proposed to optimize the former is then equivalent to batch EM, and SCS's update rule is a variant of the online version of EM. The advantages of the probabilistic framework are: (i) we no longer have spurious spread parameters that need fine tuning, such as m in fuzzy vector quantization or β in SCS; instead we have a variance term that has a sound interpretation and can be estimated from the sample; (ii) EM guarantees that the likelihood does not decrease, so it converges to the nearest local optimum; (iii) EM also allows us to estimate the underlying distance norm and the cluster priors, which we could not with the other approaches. We compare Gaussian mixtures trained with EM against LVQ (HCM), SCS, and FLVQ on the IRIS dataset and find that the mixture model is more accurate because it can take covariance information into account. We finally note that vector quantization is generally an intermediate step before computing a final output, for which supervision may be possible.
Thus, instead of an uncoupled approach, where an unsupervised method first finds the cluster parameters and a supervised mapping is then trained on the resulting memberships, we advocate a coupled approach in which the cluster parameters and the mapping are trained together in a supervised manner. The uncoupled approach ignores the error at the outputs, which may not be ideal.
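The three membership schemes named in the synopsis, and the center update they share, can be sketched in a few lines. This is a minimal NumPy illustration of the general idea, not the paper's implementation: it assumes Euclidean distance, and for the Gaussian case it assumes equal priors and a shared isotropic covariance (the setting the synopsis says makes the fuzzy objective coincide with the mixture likelihood). Function names and the `var`/`m` parameters are ours.

```python
import numpy as np

def hcm_membership(X, centers):
    """Hard 1-of-c competition: each point is fully assigned to its nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (n, c) squared distances
    U = np.zeros_like(d2)
    U[np.arange(len(X)), d2.argmin(axis=1)] = 1.0
    return U

def fcm_membership(X, centers, m=2.0):
    """FCM: membership proportional to d^(-2/(m-1)), normalized over the c clusters."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
    w = d2 ** (-1.0 / (m - 1.0))  # equals d^(-2/(m-1)) since d2 = d^2
    return w / w.sum(axis=1, keepdims=True)

def gaussian_posterior(X, centers, var=1.0):
    """E-step of a GM with equal priors and shared isotropic variance: soft posteriors."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-0.5 * d2 / var)
    return w / w.sum(axis=1, keepdims=True)

def update_centers(X, U):
    """Membership-weighted means: the M-step, which FCM-AO-style updates also use."""
    return (U.T @ X) / U.sum(axis=0)[:, None]
```

Alternating `gaussian_posterior` and `update_centers` is batch EM for this restricted mixture; substituting `hcm_membership` or `fcm_membership` in the same loop recovers HCM and FCM-style alternating optimization, which is the correspondence the paper develops.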


📜 SIMILAR VOLUMES


On the fast search algorithms for vector
✍ Wen-Shiung Chen; Lili Hsieh; Shang-Yuan Yuan 📂 Article 📅 2002 🏛 John Wiley and Sons 🌐 English ⚖ 128 KB

One of the major difficulties arising in vector quantization (VQ) is high encoding time complexity. Based on the well-known partial distance search (PDS) method and a special order of codewords in the VQ codebook, two simple and efficient methods are introduced in fast full search vector qu

The learning vector quantization algorit
✍ M.T. Martín-Valdivia; L.A. Ureña-López; M. García-Vega 📂 Article 📅 2007 🏛 Elsevier Science 🌐 English ⚖ 998 KB

Automatic text classification is an important task for many natural language processing applications. This paper presents a neural approach to develop a text classifier based on the Learning Vector Quantization (LVQ) algorithm. The LVQ model is a classification method that uses a competitive supervi

Two-dimensional split and merge algorith
✍ Wai-Fong Lee; Chok-Ki Chan 📂 Article 📅 1998 🏛 Elsevier Science 🌐 English ⚖ 936 KB

In this paper, an image is modelled as a two and a half-dimensional surface and an approximated surface for the image is formed by triangular patches. A new two-dimensional split and merge algorithm (2DSM) for generating the approximated surface has been devised. The algorithm iteratively improves t