Contour approximation of data: A duality theory
โ Scribed by Cem Iyigun; Adi Ben-Israel
- Publisher
- Elsevier Science
- Year
- 2009
- Tongue
- English
- Weight
- 393 KB
- Volume
- 430
- Category
- Article
- ISSN
- 0024-3795
No coin nor oath required. For personal study only.
โฆ Synopsis
Given a dataset D partitioned in clusters, the joint distance function (JDF) J(x) at any point x is the harmonic mean of the distances between x and the cluster centers. The JDF is a continuous function, capturing the data points in its lower level sets (a property called contour approximation), and is a useful concept in probabilistic clustering and data analysis. In particular, contour approximation allows a compact representation of the data: for a dataset in R n with N points organized in K clusters, the JDF requires K centers and covariances (if Mahalanobis distances are used), for a total of Kn(n + 3)/2 parameters, and a considerable reduction of storage if N K, n. The JDF of the whole dataset, J(D):= {J(x) : x โ D}, is a measure of the classifiability of the data, and can be used to determine the "right" number of clusters for D. A duality theory for the JDF J(D) is given, in analogy with Kuhn's geometric duality theory for the Fermat-Weber location problem. The JDF J(D) is the optimal value of a primal problem (P), for which a dual problem (D) is given, with a sharp lower bound on J(D).
๐ SIMILAR VOLUMES
Let an m X n matrix A be approximated by a rank-r matrix with an accuracy E. We prove that it is possible to choose r columns and r rows of A formin a so-called pseudoskeleton component which approximates A with B<&<& + $ n )) accuracy in the sense of the e-norm. On the way to this estimate we study