A clustering procedure based on the comparison between the k nearest neighbors graph and the minimal spanning tree
✍ Scribed by José Marı́a González-Barrios; Adolfo J. Quiroz
- Publisher
- Elsevier Science
- Year
- 2003
- Tongue
- English
- Weight
- 250 KB
- Volume
- 62
- Category
- Article
- ISSN
- 0167-7152
No coin nor oath required. For personal study only.
✦ Synopsis
We present a procedure for the identiÿcation of clusters in multivariate data sets, based on the comparison between the k nearest neighbors graph, G k , and the minimal spanning tree, MST. Our key statistic is the random quantity k := the smallest k such that G k contains the MST. Under regularity assumptions, we show that for i.i.d. data from a density on R d with compact support having one connected component, k =O Pr (log n), where n denotes sample size, a bound that seems to be sharp, according to simulations. This leads to a consistent test for the identiÿcation of crisp clusters. We illustrate the use of our procedure with an example.
📜 SIMILAR VOLUMES