Clustering algorithms typically use the Euclidean distance. However, spatial proximity is dependent on obstacles, caused by related information in other layers of the spatial database. We present a clustering algorithm suitable for large spatial databases with obstacles. The algorithm is free of use
Cluster tests for geographical areas with binary data
Scribed by Friedrich Gebhardt
- Publisher: Elsevier Science
- Year: 1999
- Tongue: English
- Weight: 436 KB
- Volume: 31
- Category: Article
- ISSN: 0167-9473
Synopsis
Consider an area subdivided into non-overlapping districts, e.g. a state divided into counties, and assume that some districts are marked as having some distinguishing property. The question then arises whether the marked districts are distributed randomly or exhibit some spatial clustering. This question is pursued here partly theoretically, in particular using regular tessellations of the plane (hexagons in a honeycomb), and partly by simulations using these regular constellations as well as real situations, in particular 171 counties in six Bundesländer of Germany. Connectivity regions (maximal regions of marked districts) turn out to be a bad choice in general. Instead, clusters are formed from triplets of marked districts. Approximate statistical tests are developed. They are simple enough to be used for data mining, where potentially a large number of tests have to be performed.
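To make the question in the synopsis concrete, the following is a minimal sketch of a cluster test on a honeycomb of districts. It is not the approximate test developed in the article: it swaps in a plain Monte Carlo permutation test, and it assumes that a "triplet" means three marked districts that are pairwise adjacent. The function names (`hex_adjacency`, `count_marked_triplets`, `permutation_p_value`), the rhombus-shaped grid, and the example marking are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations

# Axial-coordinate offsets of the six neighbours of a hexagon.
HEX_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def hex_adjacency(n):
    """Adjacency map of an n-by-n rhombus of hexagons (axial coordinates)."""
    cells = {(q, r) for q in range(n) for r in range(n)}
    return {
        (q, r): {(q + dq, r + dr) for dq, dr in HEX_OFFSETS} & cells
        for (q, r) in cells
    }


def count_marked_triplets(adjacency, marked):
    """Count sets of three marked districts that are pairwise adjacent.

    Assumption: this is one possible reading of "triplets of marked
    districts"; the article may define them differently.
    """
    marked = set(marked)
    total = 0
    for district in marked:
        marked_neighbours = sorted(adjacency[district] & marked)
        for a, b in combinations(marked_neighbours, 2):
            if b in adjacency[a]:
                total += 1
    return total // 3  # each triplet is found once from each of its members


def permutation_p_value(adjacency, marked, n_sim=999, seed=0):
    """Monte Carlo p-value for 'at least as many triplets as observed'.

    Null hypothesis: the same number of marks is placed uniformly at
    random over the districts.
    """
    rng = random.Random(seed)
    districts = sorted(adjacency)
    n_marked = len(set(marked))
    observed = count_marked_triplets(adjacency, marked)
    hits = sum(
        count_marked_triplets(adjacency, rng.sample(districts, n_marked))
        >= observed
        for _ in range(n_sim)
    )
    return observed, (hits + 1) / (n_sim + 1)


# Example: a compact "flower" of seven marked hexagons in a 6-by-6 grid.
adjacency = hex_adjacency(6)
centre = (2, 2)
marked = [centre] + sorted(adjacency[centre])
observed, p_value = permutation_p_value(adjacency, marked)
print(f"observed triplets: {observed}, Monte Carlo p-value: {p_value:.3f}")
```

Because the statistic only needs an adjacency map, the same functions could in principle be applied to an irregular map of real districts by supplying a dictionary of neighbouring districts instead of `hex_adjacency`. A simulation-based test like this is much slower than a closed-form approximation, which matters in the data-mining setting the synopsis mentions.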
SIMILAR VOLUMES
This paper proposes an independence test for a set of clustered binary observations, such as might be encountered in developmental toxicity studies. An exchangeable binary model is employed, under an assumption of exchangeability among cluster elements, to model probability of positive response. Wit
This paper attempts to demonstrate the utility of cluster analysis as a descriptive method of studying mortality in epidemiology. In order to verify which algorithms of clustering best fit the data structure, the method of cophenetic correlation was implemented. Furthermore the probabilistic algorit