Selection of individual variables versus intervals of variables in PLSR
✍ Scribed by Masoud Shariati-Rad; Masoumeh Hasani
- Publisher
- John Wiley and Sons
- Year
- 2010
- Language
- English
- Volume
- 24
- Category
- Article
- ISSN
- 0886-9383
- DOI
- 10.1002/cem.1266
✦ Synopsis
Abstract
The selection abilities of two well‐known variable selection techniques, synergy interval‐partial least‐squares (SiPLS) and genetic algorithm‐partial least‐squares (GA‐PLS), were examined and compared. Using simulated and real (corn and metabolite) datasets, and with attention to the spectral overlap of the components, the influence of selecting either intervals of variables or individual variables on prediction performance was examined. In the simulated datasets, GA‐PLS gave better results as the overlap between the component spectra decreased and in cases where the components had narrow bands. In contrast, SiPLS performed better for data with intermediate overlap. For mixtures of highly overlapping analytes, GA‐PLS showed slightly better performance. However, in most cases the differences between the results of the two selection methods were not significant. For the corn dataset, SiPLS gave slightly better prediction performance except for the moisture content, but its improvement over GA‐PLS was not significant. For real data with less overlapped components (the metabolite dataset), GA‐PLS, which tends to select far fewer variables, did not give significantly better root mean square error of cross‐validation (RMSECV), cross‐validated R² (Q²), or root mean square error of prediction (RMSEP) than SiPLS. Irrespective of the type of dataset, GA‐PLS resulted in models with fewer latent variables (LVs). In terms of computational time, GA‐PLS is considered superior to SiPLS. Copyright © 2010 John Wiley & Sons, Ltd.
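The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the commonly used definitions: RMSECV and RMSEP are root mean square errors over cross-validated and independent test predictions respectively, and Q² is one minus PRESS divided by the total sum of squares. It also assumes that SiPLS candidates are formed by splitting the variables into contiguous, roughly equal intervals and combining a few of them. All function names (`rmse`, `q2`, `split_intervals`, `interval_combinations`) are hypothetical, and the PLS model fitting itself is left out.

```python
import math
from itertools import combinations


def rmse(y_true, y_pred):
    """Root mean square error. With cross-validated predictions this is
    RMSECV; with predictions on an independent test set it is RMSEP."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def q2(y_true, y_cv_pred):
    """Cross-validated R^2 (assumed definition: 1 - PRESS / SS_total)."""
    mean_y = sum(y_true) / len(y_true)
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_cv_pred))
    ss_total = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - press / ss_total


def split_intervals(n_vars, n_intervals):
    """Split variable indices 0..n_vars-1 into contiguous intervals of
    near-equal width (the first intervals absorb any remainder)."""
    base, extra = divmod(n_vars, n_intervals)
    intervals, start = [], 0
    for i in range(n_intervals):
        width = base + (1 if i < extra else 0)
        intervals.append(list(range(start, start + width)))
        start += width
    return intervals


def interval_combinations(n_vars, n_intervals, n_combined):
    """Yield (interval ids, column indices) for every combination of
    n_combined intervals: the candidate variable sets an interval-based
    search would score, e.g. by RMSECV of a PLS model on those columns."""
    intervals = split_intervals(n_vars, n_intervals)
    for combo in combinations(range(n_intervals), n_combined):
        columns = [j for i in combo for j in intervals[i]]
        yield combo, columns


# Illustrative values only (not data from the paper).
y_obs = [1.0, 2.0, 3.0, 4.0]
y_cv = [1.1, 1.9, 3.2, 3.8]
example_rmsecv = rmse(y_obs, y_cv)
example_q2 = q2(y_obs, y_cv)
candidates = list(interval_combinations(n_vars=10, n_intervals=3, n_combined=2))
```

In an interval-based search each candidate column set would be scored by cross-validation and the combination with the lowest RMSECV kept. A genetic algorithm would instead evolve a binary mask over individual variables, which is why it can end up with far fewer variables.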
📜 SIMILAR VOLUMES
A robust method of selecting variables with the greatest discriminatory power is presented in the paper. It is based on the robustified Wilks' Λ statistic and can be applied in a multi-group discrimination problem. An application to some respiratory disease data together with a comparison of the clas
The main objective of this work was to develop a way to select a subset of analytical variables from one initial broad study to monitor groundwater quality. The study was applied to one Spanish area. The work was focused on selecting actual physico-chemical variables rather than using typical multiv
A new method for the choice of variables with the greatest discriminatory power in the location model for mixed variable discriminant analysis is presented in the paper. The procedure based on the multivariate discriminatory measure enables a simultaneous reduction of the number of discrete and con