𝔖 Bobbio Scriptorium
✦ LIBER ✦

The peaking phenomenon in the presence of feature-selection

โœ Scribed by Chao Sima; Edward R. Dougherty


Publisher: Elsevier Science
Year: 2008
Tongue: English
Weight: 332 KB
Volume: 29
Category: Article
ISSN: 0167-8655


✦ Synopsis


For a fixed sample size, a common phenomenon is that the error of a designed classifier decreases and then increases as the number of features grows. This peaking phenomenon has been recognized for forty years and depends on the classification rule and feature-label distribution. Historically, the peaking phenomenon has been treated by assuming a fixed ordering of the features, usually beginning with the strongest individual feature and proceeding with features of decreasing individual classification capability. This does not take into account feature-selection, which is commonplace in high-dimensional and small sample settings. This paper revisits the peaking phenomenon in the presence of feature-selection. Using massive simulation in a high-performance computing environment, the paper considers various combinations of feature-label models, feature-selection algorithms, and classifier models to produce a large library of error versus feature size curves. Owing to the prevalence of feature-selection in genomic classification, we also consider gene-expression-based classification of breast-cancer patient prognosis. Results vary widely and are strongly dependent on the combination. The error curves tend to fall into three categories: peaking, settling into a plateau, or falling very slowly over a long range of feature set sizes. It can be concluded that one should be wary of applying peaking results found in the absence of feature-selection to settings in which feature-selection is employed.
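The kind of error-versus-feature-size curve the study describes can be sketched in miniature. The following is an illustrative assumption, not the authors' actual protocol: a small fixed training sample, a simple t-statistic feature ranking as the feature-selection step (applied to training data only), and a nearest-centroid classifier whose true error is approximated on a large independent test set, evaluated at every feature set size. All function names, the Gaussian model, and the parameter values are hypothetical choices for illustration.

```python
# Hedged sketch: one error-vs-feature-size curve under feature selection.
# Model assumption: two Gaussian classes in 50 dimensions where only the
# first 5 features carry class information (a common synthetic setup,
# not the paper's specific feature-label models).
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, dim, n_informative=5, shift=0.8):
    """Two equal-prior Gaussian classes; only the first n_informative
    features differ in mean by `shift`."""
    mu = np.zeros(dim)
    mu[:n_informative] = shift
    x0 = rng.normal(0.0, 1.0, size=(n // 2, dim))
    x1 = rng.normal(0.0, 1.0, size=(n // 2, dim)) + mu
    X = np.vstack([x0, x1])
    y = np.array([0] * (n // 2) + [1] * (n // 2))
    return X, y

def t_scores(X, y):
    """Absolute two-sample t-like statistic per feature, used only to
    rank features (the feature-selection step)."""
    a, b = X[y == 0], X[y == 1]
    s = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (s + 1e-12)

def nearest_centroid_error(Xtr, ytr, Xte, yte):
    """Train a nearest-centroid classifier and estimate its error on a
    large test set (a stand-in for the true error)."""
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    d0 = ((Xte - c0) ** 2).sum(axis=1)
    d1 = ((Xte - c1) ** 2).sum(axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred != yte).mean())

dim = 50
Xtr, ytr = make_data(30, dim)       # small fixed training sample
Xte, yte = make_data(5000, dim)     # large test set approximates true error
order = np.argsort(-t_scores(Xtr, ytr))  # selection uses training data only

# Error at every feature set size k, using the k top-ranked features.
curve = [nearest_centroid_error(Xtr[:, order[:k]], ytr,
                                Xte[:, order[:k]], yte)
         for k in range(1, dim + 1)]
print(min(curve), curve[-1])
```

Averaging such curves over many draws of the training sample, and repeating across classifier models, selection rules, and feature-label distributions, is what produces the library of curves the synopsis refers to; whether a given curve peaks, plateaus, or drifts slowly downward depends on that combination.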


📜 SIMILAR VOLUMES


The role of eigenvalues in linear featur
โœ D.R. Brown; M.J. O'Malley ๐Ÿ“‚ Article ๐Ÿ“… 1977 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 523 KB

A family of examples is constructed to show that if B is the k × n matrix (I_k | Z)U, where U is an n × n orthogonal matrix, then the eigenvalues of U do not affect the value of the divergence D_B in the space of reduced dimension.