A new strategy of outlier detection for QSAR/QSPR
✍ Scribed by Dong-Sheng Cao; Yi-Zeng Liang; Qing-Song Xu; Hong-Dong Li; Xian Chen
- Publisher: John Wiley and Sons
- Year: 2009
- Language: English
- File size: 420 KB
- Volume: 31
- Category: Article
- ISSN: 0192-8651
✦ Synopsis
Abstract
A crucial step in building a high‐performance QSAR/QSPR model is the detection of outliers. Detecting outliers in a multivariate point cloud is not trivial, especially when several outliers coexist. Classical identification methods do not always find them, because they rely on the sample mean and covariance matrix, which are themselves influenced by the outliers. Moreover, existing methods emphasize only certain types of outliers rather than all of them. To avoid these problems and detect all kinds of outliers simultaneously, we propose a new strategy based on Monte‐Carlo cross‐validation, termed the MC method. The MC method inherently provides a feasible way to detect different kinds of outliers by establishing many cross‐predictive models. With the help of the distribution of predictive residuals thus obtained, it appears able to reduce the risk caused by the masking effect. In addition, a new display is proposed, in which the absolute value of the mean of the predictive residuals is plotted against the standard deviation of the predictive residuals. The plot divides the data into normal samples, y‐direction outliers, and X‐direction outliers. Several examples demonstrate the detection ability of the MC method through comparison with different diagnostic methods. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010
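The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the regression model, the number of random splits, or the training fraction, so ordinary least squares, `n_splits=500`, and `train_fraction=0.7` are assumptions chosen for illustration, and the function name `mccv_residual_stats` is hypothetical. For each random split, a model is fitted on the training part, and the predictive residuals of the held‐out samples are accumulated; the function then returns, per sample, the absolute mean and the standard deviation of those residuals, which are the two axes of the display described above.

```python
import numpy as np


def mccv_residual_stats(X, y, n_splits=500, train_fraction=0.7, seed=0):
    """Per-sample |mean| and std of predictive residuals over random splits.

    Assumption: ordinary least squares with an intercept stands in for
    whatever regression model is actually used; the split settings are
    illustrative defaults, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_train = int(round(train_fraction * n))
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept

    sums = np.zeros(n)      # running sum of residuals per sample
    sq_sums = np.zeros(n)   # running sum of squared residuals per sample
    counts = np.zeros(n)    # how often each sample was held out

    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        # Fit on the training part only, then predict the held-out part.
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        resid = y[test] - A[test] @ coef
        sums[test] += resid
        sq_sums[test] += resid ** 2
        counts[test] += 1

    counts = np.maximum(counts, 1)  # guard against never-held-out samples
    mean = sums / counts
    var = np.maximum(sq_sums / counts - mean ** 2, 0.0)
    return np.abs(mean), np.sqrt(var)


# Usage on synthetic data (hypothetical example, not data from the paper).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(40, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1] + 0.1 * rng.normal(size=40)
y_demo[5] += 5.0  # plant one gross error in the response
abs_mean, std = mccv_residual_stats(X_demo, y_demo)
print(int(np.argmax(abs_mean)))
```

Plotting `abs_mean` against `std` (for example with a scatter plot) gives a display of the kind the abstract describes. Because every sample is predicted only by models that were fitted without it, a single aberrant sample cannot pull the fit toward itself in the splits where it is held out.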
📜 SIMILAR VOLUMES
## Abstract ChemInform is a weekly Abstracting Service, delivering concise information at a glance that was extracted from about 100 leading journals. To access a ChemInform Abstract of an article which was published elsewhere, please select a “Full Text” option. The original article is trackable v
## Abstract The quantitative structure‐activity relationship (QSAR) of a series of structurally diverse malonyl‐CoA decarboxylase (MCD) inhibitors has been investigated by using the predictive single model as well as the consensus analysis based on a new strategy proposed by us. Self‐organizing map (SOM