
Book Review: Wiley Series in Probability and Statistics, 2003. By Peter J. Rousseeuw and Annick M. Leroy

✍ Scribed by Christine Müller


Publisher: John Wiley and Sons
Year: 2006
Language: English
Volume: 48
Category: Article
ISSN: 0323-3847


✦ Synopsis


The book deals with the problem of fitting a line or hyperplane to data in the presence of outliers. Outliers are a severe problem in many practical applications. This holds in particular for regression, since the classical method, least squares, is very sensitive to outliers. Outliers are often dropped from the data by hand, which introduces an uncontrolled bias and is feasible mainly when fitting a line to a two-dimensional data set. As soon as there are many explanatory variables, the detection of outliers becomes a very difficult task. For higher dimensions, it was proposed to examine the residuals; but due to a masking effect of the least squares estimator, the largest residuals may not belong to the outliers. Ever since outliers became a topic in statistics, people have been searching for outlier-robust methods in regression. Some early approaches, such as the extension of the outlier-robust median to the L1-estimator and other M-estimators for regression, were not very successful, since outliers at leverage points can destroy their robustness. The first truly outlier-robust methods for regression were the least median of squares estimator (LMS), the least trimmed squares estimator (LTS) and other estimators defined by minimizing an outlier-robust measure of scatter of the residuals (S-estimators).
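For orientation, the criteria behind these estimators can be written in standard notation. The symbols are not taken from the review and are assumptions of this sketch: r_i(θ) = y_i − x_iᵀθ is the i-th residual for parameter θ, n is the sample size, h is a trimming constant with roughly n/2 ≤ h ≤ n, and s denotes a robust (M-)estimate of scale.

```latex
\begin{align*}
\hat\theta_{\mathrm{LS}}  &= \arg\min_{\theta} \sum_{i=1}^{n} r_i(\theta)^2, \\
\hat\theta_{\mathrm{LMS}} &= \arg\min_{\theta} \operatorname*{med}_{i} \, r_i(\theta)^2, \\
\hat\theta_{\mathrm{LTS}} &= \arg\min_{\theta} \sum_{i=1}^{h} \bigl(r(\theta)^2\bigr)_{(i)}, \\
\hat\theta_{\mathrm{S}}   &= \arg\min_{\theta} s\bigl(r_1(\theta), \dots, r_n(\theta)\bigr),
\end{align*}
% (r^2)_{(1)} \le \dots \le (r^2)_{(n)} are the ordered squared residuals.
```

Least squares sums all squared residuals, so a single large residual can dominate the criterion. LMS and LTS look only at the median or at the h smallest squared residuals, which is why a minority of outliers cannot pull the fit arbitrarily far.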

These methods and other early proposals for outlier-robust regression are treated in detail in this book. The different methods are compared with respect to their outlier robustness, and it is shown that good outlier-robust methods can be used for outlier detection since they do not suffer from the masking effect. Some examples also demonstrate that outliers can contain important information, so that they do not merely disturb the data analysis. Knowing them is also valuable for a good interpretation of the data.
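The masking effect and its remedy can be illustrated with a small Python sketch. Everything here is an assumption made for illustration and not taken from the book: the data are made up, `lms_line` is a hypothetical helper that only approximates LMS by searching over lines through pairs of observations, the robust scale is the normal-consistent median absolute residual (factor 1.4826), and the cutoff 2.5 for standardized residuals is a common convention.

```python
from itertools import combinations

import numpy as np


def lms_line(x, y):
    """Approximate LMS line (hypothetical helper): among all lines through
    two observations, keep the one with the smallest median squared residual."""
    best_crit, best_slope, best_icpt = np.inf, 0.0, 0.0
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        icpt = y[i] - slope * x[i]
        crit = np.median((y - (icpt + slope * x)) ** 2)
        if crit < best_crit:
            best_crit, best_slope, best_icpt = crit, slope, icpt
    return best_slope, best_icpt


# Made-up data: y = 1 + 2x with small alternating noise,
# then two outliers placed at the largest x values (leverage points).
x = np.arange(10.0)
y = 1.0 + 2.0 * x + 0.2 * (-1.0) ** np.arange(10)
y[8:] = 0.0

# Least squares fit; residuals standardized by the classical residual scale.
ls_slope, ls_icpt = np.polyfit(x, y, 1)
ls_resid = y - (ls_icpt + ls_slope * x)
ls_scale = np.sqrt(np.sum(ls_resid ** 2) / (len(x) - 2))
ls_flagged = np.flatnonzero(np.abs(ls_resid) / ls_scale > 2.5).tolist()

# Robust fit; residuals standardized by a robust scale (assumed choice).
rb_slope, rb_icpt = lms_line(x, y)
rb_resid = y - (rb_icpt + rb_slope * x)
rb_scale = 1.4826 * np.median(np.abs(rb_resid))
robust_flagged = np.flatnonzero(np.abs(rb_resid) / rb_scale > 2.5).tolist()

print("flagged by least squares:", ls_flagged)
print("flagged by robust fit:   ", robust_flagged)
```

On this toy data the least squares line is pulled towards the two outliers, so no standardized residual exceeds the cutoff and the largest least squares residual belongs to a regular observation. The residuals from the robust fit flag exactly the two contaminated points.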

The book is a reprint of a book with the same title from 1987. It is rather disappointing that nothing new was included in the new edition. Even the references stop at the year 1987. Since 1987, a huge number of new methods in robust regression have been developed. However, there are so many methods for robust regression nowadays that it is probably impossible to write a book which covers them all. In this sense, it may be a good idea to keep the book as it is.

The first edition of this book was in a sense the starting point of the impressive development in robust regression. It had an impact not only on regression but also on other areas of statistics. In particular, the idea behind the least trimmed squares estimator, namely minimizing a trimmed scatter measure, has since been transferred to many other areas such as generalized linear models and multivariate estimation of location and scatter. Therefore the book was and still is cited in many articles, and it can be assumed that the new edition will be similarly successful.

The book is appropriate for an introductory course in robust regression and outlier detection. After an introduction that presents the breakdown point as a measure of robustness and introduces several robust methods, it discusses the methods for simple linear regression in Section 2 and for multiple regression in Section 3. One-dimensional location is treated as a special case in Section 4. In Section 5, algorithms for calculating the estimators are presented, and Section 6 shows how the methods can be used for outlier diagnostics. Section 7 covers related statistical techniques such as robust estimation for multivariate data and robust time series analysis. In particular, some of the first robust methods for estimating multivariate location and covariance matrices, such as Tukey's halfspace median and the minimum volume ellipsoid, are discussed there. All methods of the book are explained by many examples from biology, medicine, education, economics and other areas. The properties of the methods are formulated in theorems with detailed proofs. Therefore the book is appropriate for mathematicians as well as for people working in applications. Since exercises are given at the end of each section, the book is particularly suitable as a textbook for students. However, a serious disadvantage is that the program code provided is outdated. It is tied to the special-purpose program PROGRESS, whereas nowadays the methods are mostly available in R and S-PLUS. Pointers to the names under which they can be found there would have been more helpful.
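The breakdown point mentioned above can be made concrete in the one-dimensional location case. The following Python sketch is an illustration under assumptions, not material from the book: the data, the hypothetical helper `contaminate`, and the contaminating value 1e6 are all made up. It replaces m of ten observations by a wild value and compares the mean with the median.

```python
import numpy as np

# Ten well-behaved observations: 1, 2, ..., 10 (made-up data).
clean = np.arange(1.0, 11.0)


def contaminate(data, m, value=1e6):
    """Return a copy of `data` with its last m entries replaced by `value`."""
    out = data.copy()
    if m > 0:  # guard: out[-0:] would overwrite every entry
        out[-m:] = value
    return out


for m in range(6):
    bad = contaminate(clean, m)
    print(f"m={m}: mean={np.mean(bad):.1f}  median={np.median(bad):.1f}")
```

A single replaced observation already carries the mean away, whereas the median stays at its original value until half of the sample is contaminated. In this sense the mean has the lowest possible breakdown point and the median the highest.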

