The effect of structural redundancy in validation sets on virtual screening performance
Authored by Robert D. Clark; Jennifer K. Shepphird; John Holliday
- Publisher: John Wiley and Sons
- Year: 2009
- Language: English
- File size: 460 KB
- Volume: 23
- Category: Article
- ISSN: 0886-9383
- DOI: 10.1002/cem.1240
Synopsis
Abstract
The performance of a classification model is often assessed in terms of how well it separates a set of known observations into appropriate classes. If the validation sets used for such analyses are redundant due to bias in sampling, the relevance of the conclusions drawn to prospective work in which new kinds of positives are sought may be compromised. In the case of the various virtual screening techniques used in modern drug discovery, such bias generally appears as over-representation of particular structural subclasses in the test set. We show how clustering by substructural similarity, followed by applying arithmetic and harmonic weighting schemes to receiver operating characteristic (ROC) curves, can be used to identify validation sets that are biased due to such redundancies. This can be accomplished qualitatively by direct examination or quantitatively by comparing the areas under the respective linear or semilog curves (AUCs or pAUCs). Copyright © 2009 John Wiley & Sons, Ltd.
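The idea of comparing unweighted and cluster-weighted ROC areas can be illustrated with a small Python sketch. The abstract does not define the two weighting schemes, so the definitions here are assumptions made for illustration only: "arithmetic" weighting is taken to give each active a weight of 1/(size of its cluster), and "harmonic" weighting is taken to give the k-th active retrieved from a cluster a weight of 1/k. The function names, the toy scores, and the cluster labels are all hypothetical, and cluster assignments are supplied by hand rather than computed from substructural similarity.

```python
from collections import Counter, defaultdict


def weighted_auc(scores, labels, weights):
    """Area under a ROC curve in which each active carries its own weight.

    scores: higher means ranked earlier; labels: 1 = active, 0 = decoy.
    weights: one value per compound (values for decoys are ignored).
    Assumption: no tied scores, so no tie handling is attempted.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_w = sum(weights[i] for i in order if labels[i] == 1)
    n_decoys = sum(1 for i in order if labels[i] == 0)
    tpr, area = 0.0, 0.0
    for i in order:
        if labels[i] == 1:
            tpr += weights[i] / total_w      # step up by the active's share
        else:
            area += tpr / n_decoys           # step right by one decoy
    return area


def arithmetic_weights(labels, clusters):
    """Assumed scheme: every cluster of actives contributes equally overall."""
    sizes = Counter(c for c, y in zip(clusters, labels) if y == 1)
    return [1.0 / sizes[c] if y == 1 else 0.0 for c, y in zip(clusters, labels)]


def harmonic_weights(scores, labels, clusters):
    """Assumed scheme: the k-th active retrieved from a cluster gets 1/k."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    seen = defaultdict(int)
    weights = [0.0] * len(scores)
    for i in order:
        if labels[i] == 1:
            seen[clusters[i]] += 1
            weights[i] = 1.0 / seen[clusters[i]]
    return weights


# Toy validation set: four near-duplicate actives (cluster "A") rank first,
# while the lone active of a different chemotype (cluster "B") ranks last.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1]
clusters = ["A", "A", "A", "A", None, None, None, None, "B"]

auc_plain = weighted_auc(scores, labels, [1.0] * len(scores))
auc_arith = weighted_auc(scores, labels, arithmetic_weights(labels, clusters))
auc_harm = weighted_auc(scores, labels, harmonic_weights(scores, labels, clusters))
print(auc_plain, auc_arith, auc_harm)
```

In this toy case the unweighted area is the largest, the arithmetic-weighted area is the smallest, and the harmonic-weighted area falls in between. A gap of that kind between the unweighted and weighted areas is the sort of signal that would suggest the validation set over-represents one structural subclass.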
SIMILAR VOLUMES
## Abstract An important area in Computer-Mediated Communication (CMC) is how gender affects trust building and performance in virtual settings. This paper empirically investigates gender differences in two media, video and Instant Messaging, while performing negotiation tasks. The primary results
## Abstract Multifactor Dimensionality Reduction (MDR) was developed to detect genetic polymorphisms that present an increased risk of disease. Cross-validation (CV) is an important part of the MDR algorithm, as it prevents over-fitting and allows the predictive ability of a model to be evaluated.
## Abstract Numerous studies conclude that the selective adsorption of plasma proteins on materials contacting blood or tissue affects all subsequent interactions related to the biocompatibility of artificial surfaces. However, there are only a few studies available, which clearly demonstrate that
We aimed to determine how differences in the age at which women had their pregnancies influenced the expected detection and false-positive rates of serum screening for Down's syndrome (i) between 1970 and 1993 in England and Wales, and (ii) between regions and districts of England and Wales in 1991