
Book review: Data Analysis for Chemists: Applications to QSAR and Chemical Products Design. David Livingstone. Oxford University Press, Oxford, 1995, ISBN 0 19 855728 0, xvi+239 pp., £40

✍ Scribed by Rolf Sundberg


Publisher
John Wiley and Sons
Year
1998
Language
English
Volume
12
Category
Article
ISSN
0886-9383


✦ Synopsis


The cover of this book gives only the broad title 'Data Analysis for Chemists'. The subtitle found inside the book tells more about its flavour. An even more fitting title would be 'Talking about Statistical Methodology Used in QSAR Studies'. In the author's own words the book is intended to be a practical guide for chemists to 'the apparently esoteric methods of multivariate statistics, otherwise known as pattern recognition or chemometrics'. The reader might be startled by this characterization of multivariate statistics, but it reflects the jargon of the author. In section 1.5 he frankly states that 'Chemometrics and multivariate analysis refer to more or less the same things' and that 'Pattern recognition and chemometrics are more or less synonymous'. This opinion is also to some extent reflected in the contents, and more so in chapter titles, which use terms like unsupervised and supervised learning and artificial intelligence.

Chapter 1 gives a nice introduction to QSAR/QSPR, that is, the study of the quantitative relationship between biological activity or other property and chemical structure. It contains some bits of the historical development of the subject, and fact boxes explaining several common molecular descriptors, such as the electronic and the hydrophobic substituent constants (σ and π).

The second chapter discusses experimental design, but like most of the book only in a sketchy way, and completely without data analysis. Designs mentioned include Latin and Graeco-Latin squares, full factorials and fractional two-level factorials, and D-optimal designs. A final section describes strategies for compound selection. Terminology sometimes becomes quite confusing when the author tries to let the term treatment also cover the undefined concept experimental unit. One example of a fractional factorial is given, a 2^(5-2), with its aliasing partially written down but unfortunately not quite correctly.
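For readers unfamiliar with aliasing, the following minimal Python sketch shows how the alias structure of a 2^(5-2) fractional factorial can be derived mechanically. The generators D = AB and E = AC are an assumption chosen purely for illustration; the design printed in the book may use different ones, and the function names are hypothetical.

```python
from itertools import combinations, product

# Assumed generators for illustration only: D = AB, E = AC.
# This is not necessarily the design given in the book under review.
BASE = "ABC"
GENERATORS = {"D": "AB", "E": "AC"}


def multiply(word1, word2):
    """Multiply two effect words; a letter appearing twice cancels (A*A = I)."""
    return "".join(sorted(set(word1) ^ set(word2)))


def design_matrix():
    """Return the 8 runs as dicts mapping each factor to its level (-1 or +1)."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for factor, word in GENERATORS.items():
            level = 1
            for letter in word:
                level *= run[letter]
            run[factor] = level
        runs.append(run)
    return runs


def defining_relation():
    """All words equal to the identity I: the generator words and their products."""
    words = [multiply(factor, word) for factor, word in GENERATORS.items()]
    relation = set()
    for size in range(1, len(words) + 1):
        for combo in combinations(words, size):
            acc = ""
            for word in combo:
                acc = multiply(acc, word)
            relation.add(acc)
    return sorted(relation, key=lambda w: (len(w), w))


def aliases(effect):
    """Effects that cannot be separated from `effect` in this design."""
    found = (multiply(effect, word) for word in defining_relation())
    return sorted(found, key=lambda w: (len(w), w))


if __name__ == "__main__":
    print("I =", " = ".join(defining_relation()))
    for effect in "ABCDE":
        print(effect, "=", " = ".join(aliases(effect)))
```

Under these assumed generators the defining relation is I = ABD = ACE = BCDE, so every main effect is confounded with at least one two-factor interaction (for example A with BD and CE). A complete alias table of this kind is what a reader would need in order to check the book's partial listing.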

Chapter 3 (Data pre-treatment) discusses data distributions, (auto)scaling and what is called data reduction. The latter is defined by the author as selected removal of variables from data. This is motivated 'since multicollinearity is a condition

