Nearest-neighbor classification with categorical variables
- Author
- Samuel E. Buttrey
- Publisher
- Elsevier Science
- Year
- 1998
- Language
- English
- File size
- 736 KB
- Volume
- 28
- Category
- Article
- ISSN
- 0167-9473
Free of charge, no registration required. For personal study only.
Synopsis
A technique is presented for adapting nearest-neighbor classification to the case of categorical variables. The set of categories is mapped onto the real line in such a way as to maximize the ratio of total sum of squares to within-class sum of squares, aggregated over classes. The resulting real values then replace the categories, and nearest-neighbor classification proceeds with the Euclidean metric on these new values. Continuous variables can be included in this scheme with little added effort. This approach has been implemented in a computer program and tried on a number of data sets, with encouraging results.
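As a rough sketch of the scoring idea (not the paper's actual implementation, and with illustrative function and parameter names): maximizing total over within-class sum of squares is equivalent to maximizing between-class over within-class scatter, so scores for the category levels can be obtained as the leading generalized eigenvector computed from a one-hot encoding of the categories.

```python
import numpy as np

def category_scores(cats, classes, ridge=1e-8):
    """Map each category level to a real score that maximizes the
    ratio of between-class to within-class sum of squares (equivalent
    to the total/within criterion in the synopsis)."""
    levels = sorted(set(cats))
    idx = {c: j for j, c in enumerate(levels)}
    # One-hot indicator matrix: one column per category level.
    Z = np.zeros((len(cats), len(levels)))
    for i, c in enumerate(cats):
        Z[i, idx[c]] = 1.0
    mean = Z.mean(axis=0)
    Sw = np.zeros((len(levels), len(levels)))  # within-class scatter
    Sb = np.zeros_like(Sw)                     # between-class scatter
    y = np.asarray(classes)
    for k in set(classes):
        Zk = Z[y == k]
        mk = Zk.mean(axis=0)
        Sw += (Zk - mk).T @ (Zk - mk)
        Sb += len(Zk) * np.outer(mk - mean, mk - mean)
    # Small ridge keeps Sw invertible (e.g. when a category
    # occurs in only one class).
    W = Sw + ridge * np.eye(len(levels))
    # Solve the generalized eigenproblem Sb v = lambda W v via Cholesky.
    L = np.linalg.cholesky(W)
    Linv = np.linalg.inv(L)
    vals, U = np.linalg.eigh(Linv @ Sb @ Linv.T)
    v = Linv.T @ U[:, -1]   # leading eigenvector = optimal scores
    v /= np.linalg.norm(v)  # scale is arbitrary; normalize for readability
    return {c: v[idx[c]] for c in levels}
```

On a toy data set where levels `a, b` occur only in class 0 and `c, d` only in class 1, the scores for `a` and `b` coincide and sit well apart from those for `c` and `d`, which is exactly what the subsequent Euclidean nearest-neighbor step needs.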
Nearest-neighbor classification is a well-known and effective classification technique. With this scheme, an unknown item's distances to all known items are measured, and the unknown class is estimated by the class of the nearest neighbor or by the class most often represented among a set of nearest neighbors. This has proven effective in many examples, but an appropriate distance normalization is required when variables are scaled differently. For categorical variables "distance" is not even defined. In this paper categorical data values are replaced by real numbers in an optimal way; then those real numbers are used in nearest-neighbor classification. © 1998 Elsevier Science B.V. All rights reserved.
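Once the categories have been replaced by real scores, classification reduces to ordinary k-nearest-neighbor voting with the Euclidean metric. A minimal sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def nn_classify(train_X, train_y, query, k=1):
    """Classify `query` by majority vote among its k nearest
    training items under the Euclidean metric."""
    # Euclidean distance from the query to every training item.
    d = np.linalg.norm(train_X - query, axis=1)
    # Indices of the k closest training items.
    nearest = np.argsort(d)[:k]
    # Majority vote over their class labels.
    labels, counts = np.unique(np.asarray(train_y)[nearest],
                               return_counts=True)
    return labels[np.argmax(counts)]
```

Continuous variables can ride along in the same feature matrix, which is why the synopsis notes they are included "with little added effort"; the only caveat, as the paragraph above observes, is that differently scaled variables need an appropriate normalization first.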
SIMILAR VOLUMES
In this paper a very simple nonparametric classification rule for mixtures of discrete and continuous random variables is described. It is based on the nearest-neighbor method proposed by Cover and Hart (1967). Bounds on the limit of the nearest-neighbor rule's risk are given. Both lower and upper
A novel neural-net-based method of constructing optimized prototypes for nearest-neighbor classifiers is proposed. Based on an effective classification-oriented error function containing class classification and class separation components, the corresponding prototype and feature weight update rules are