Iterative Shannon Entropy – a Methodology to Quantify the Information Content of Value Range Dependent Data Distributions. Application to Descriptor and Compound Selectivity Profiling

✍ Scribed by Anne Mai Wassermann; Martin Vogt; Jürgen Bajorath


Publisher: Wiley (John Wiley & Sons)
Year: 2010
Language: English
File size: 558 KB
Volume: 29
Category: Article
ISSN: 1868-1743


✦ Synopsis


Abstract

We introduce an entropy-based methodology, Iterative Shannon entropy (ISE), to quantify the information contained in molecular descriptors and compound selectivity data sets while taking data spread directly into account. The method can be applied to determine the information content of any value-range-dependent data distribution. We analyzed descriptor information content to explore alternative binning schemes for entropy calculation. Using this entropic measure, we profiled 153 compound selectivity data sets for combinations of 68 target proteins belonging to 10 target families. With the ISE measure, we aim to assign high information content to compound data sets that span a wide range of selectivity values and different selectivity relationships and hence correspond to more than one biological phenotype. We identify target families with high average entropy scores; for members of these families, active compounds display highly differentiated selectivity profiles.
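The abstract does not spell out how ISE is computed, so the following is only an illustrative sketch of the general idea of a binning-dependent entropy that rewards data spread. It assumes, hypothetically, that a fixed value range is split into 2, 4, 8, ... equal-width bins and that the normalized Shannon entropies of these binnings are averaged. The function names `shannon_entropy` and `iterative_entropy`, the `iterations` parameter, and the averaging step are assumptions for illustration, not the authors' definition.

```python
import math
from collections import Counter


def shannon_entropy(values, lo, hi, n_bins):
    """Shannon entropy (in bits) of values placed into n_bins equal-width bins on [lo, hi].

    Assumes values is non-empty and hi > lo; out-of-range values are clamped
    into the first or last bin.
    """
    width = (hi - lo) / n_bins
    counts = Counter(
        max(0, min(int((v - lo) / width), n_bins - 1))  # clamp the bin index
        for v in values
    )
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def iterative_entropy(values, lo, hi, iterations=4):
    """Hypothetical aggregate score, NOT the published ISE formula.

    Averages the normalized entropy (entropy divided by its maximum,
    log2(n_bins)) over binnings with 2, 4, ..., 2**iterations bins.
    The result lies in [0, 1].
    """
    scores = []
    for k in range(1, iterations + 1):
        n_bins = 2 ** k
        scores.append(shannon_entropy(values, lo, hi, n_bins) / math.log2(n_bins))
    return sum(scores) / len(scores)


# Example: data spread over the whole range scores higher than clustered data.
spread = [i / 16 + 1 / 32 for i in range(16)]   # one value per 1/16-wide interval
clustered = [0.5] * 16                          # all values identical
spread_score = iterative_entropy(spread, 0.0, 1.0)
clustered_score = iterative_entropy(clustered, 0.0, 1.0)
```

Under these assumptions, a data set whose values cover the full range evenly receives the maximum score, while a data set concentrated at a single value receives zero. This mirrors the stated aim of assigning high information content to data sets that span a wide range of values. For the actual binning scheme and aggregation used by ISE, consult the article itself.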