
The improvement of SIMCA classification by using kernel density estimation : Part 2. Practical evaluation of SIMCA, ALLOC and CLASSY on three data sets

Scribed by Hilko van der Voet; Durk A. Doornbos


Publisher: Elsevier Science
Year: 1984
Tongue: English
Weight: 664 KB
Volume: 161
Category: Article
ISSN: 0003-2670


Synopsis


The performance of the new probabilistic classification method CLASSY is evaluated on three different data sets, together with its predecessors SIMCA and ALLOC. The improvement made over ALLOC is only marginal, whereas CLASSY shows better predictive ability and greater reliability than SIMCA in most cases.

The evaluation of pattern recognition techniques was considered theoretically in Part 1 of this series [1]. The present paper is concerned with what information the selected measures for predictive ability [the number of errors, NE, and the quadratic score (Q1), sharpness (Q2)] and reliability (Q5) provide about the SIMCA method (made probabilistic as described in Part 1) and the ALLOC and CLASSY methods. Because the optimal dimensionality for the principal component (PC) class models in SIMCA and CLASSY was unknown, and because the same optimum was not expected for both methods, all possible values of the class dimensionality A were examined systematically.
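As a rough illustration of how such measures can be computed from probabilistic classifications, the sketch below counts errors and evaluates a quadratic score. The exact definitions used in the paper are given in Part 1 and are not reproduced here, so the Brier-type form of the quadratic score is an assumption. The function names, and the `evaluate` callback used to scan candidate dimensionalities, are hypothetical.

```python
def number_of_errors(probs, labels):
    """Count objects whose most probable class differs from the true class."""
    errors = 0
    for p, true_class in zip(probs, labels):
        predicted = max(range(len(p)), key=lambda k: p[k])
        if predicted != true_class:
            errors += 1
    return errors


def quadratic_score(probs, labels):
    """Mean squared distance between predicted probabilities and 0/1 class
    indicators (Brier-type form; assumed, not taken from the paper)."""
    total = 0.0
    for p, true_class in zip(probs, labels):
        total += sum(
            (pk - (1.0 if k == true_class else 0.0)) ** 2
            for k, pk in enumerate(p)
        )
    return total / len(probs)


def scan_dimensionality(evaluate, candidates):
    """Apply a hypothetical evaluate(A) callback to every candidate
    class dimensionality A and collect the results."""
    return {a: evaluate(a) for a in candidates}


# Toy example: three objects, two classes (made-up probabilities).
probs = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]]
labels = [0, 0, 1]
print(number_of_errors(probs, labels))
print(quadratic_score(probs, labels))
```

With this form, a lower quadratic score means the predicted probabilities lie closer to the true class memberships. Scanning every candidate A with the same measures makes it possible to compare methods at their own best dimensionality rather than at a shared one.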

DATA AND COMPUTER PROGRAMS

The pattern recognition methods SIMCA, ALLOC and CLASSY were evaluated on three data sets.

Data sets

Iris data. The well-known iris data from Fisher have been analysed by several authors [2-4].

The data set consists of measurements made on flowers from three species of iris: Iris setosa, Iris versicolor and Iris virginica. Iris setosa was very easily distinguished from the other two by all methods, so only the latter two species were used here. There are four variables: sepal length, sepal width, petal length and petal width. Each class contains 50 individuals, which were divided randomly into two groups, a training and a test


SIMILAR VOLUMES


The improvement of SIMCA classification
Hilko van der Voet; Durk A. Doornbos | Article | 1984 | Elsevier Science | English | 661 KB

One of the disadvantages of SIMCA pattern recognition is its inability to produce probabilistic classifications. Attempts to correct this involve distributional assumptions. It appears that SIMCA can handle the residual error terms efficiently, but that inside the class model subspace a crude trunca