Anomalies are cited in the use of subset regression diagnostics. Two leverage diagnostics are unified through canonical leverages. Other diagnostics relate one-to-one with the R-Fisher statistic, supporting exact but equivalent tests at level . Three distance diagnostics exhibit ad hoc scalings; thei
Computation of determinantal subset influence in regression
Authors: Bruce E. Barrett; J. Brian Gray
- Publisher: Springer US
- Year: 1996
- Language: English
- File size: 681 KB
- Volume: 6
- Category: Article
- ISSN: 0960-3174
Synopsis
One of the important goals of regression diagnostics is the detection of cases or groups of cases which have an inordinate impact on the regression results. Such observations are generally described as 'influential'. A number of influence measures have been proposed, each focusing on a different aspect of the regression. For single cases, these measures are relatively simple and inexpensive to calculate. However, the detection of multiple-case or joint influence is more difficult on two counts. First, calculation of influence for a single subset is more involved than for an individual case, and second, the sheer number of subsets of cases makes the computation overwhelming for all but the smallest data sets.
Barrett and Gray (1992) described methods for efficiently examining subset influence for those measures that can be expressed as the trace of a product of positive semidefinite (psd) matrices. There are, however, other popular measures that do not take this form, but rather are expressible as the ratio of determinants of psd matrices. This article focuses on reducing the computation for the determinantal ratio measures by making use of upper and lower bounds on the influence to limit the number of subsets for which the actual influence must be explicitly determined.
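As a rough illustration of what a determinantal ratio measure looks like for a subset of cases, the sketch below evaluates a ratio of the Andrews-Pregibon type, det(Z_(I)'Z_(I)) / det(Z'Z), where Z = [X y] and Z_(I) omits the cases in subset I. The choice of this particular measure, the synthetic data, and all function names are assumptions made for the example; the synopsis does not say which measures the article treats, and the article's bounding scheme is not reproduced here. The sketch simply enumerates every subset, which is the brute-force cost that such bounds are meant to avoid.

```python
from itertools import combinations
from math import comb

import numpy as np


def det_ratio_direct(Z, subset):
    """det(Z_(I)' Z_(I)) / det(Z'Z), where Z_(I) drops the rows in `subset`."""
    keep = np.setdiff1d(np.arange(Z.shape[0]), subset)
    Z_del = Z[keep]
    return np.linalg.det(Z_del.T @ Z_del) / np.linalg.det(Z.T @ Z)


def det_ratio_hat(H, subset):
    """Same ratio computed as det(I - H_II), using only an m x m block of H."""
    idx = np.asarray(subset)
    H_II = H[np.ix_(idx, idx)]
    return np.linalg.det(np.eye(len(idx)) - H_II)


# Synthetic data (hypothetical): n cases, p regression columns, subsets of size m.
rng = np.random.default_rng(0)
n, p, m = 12, 3, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

Z = np.column_stack([X, y])               # augmented matrix [X y]
H = Z @ np.linalg.solve(Z.T @ Z, Z.T)     # hat matrix of Z

# Brute force: one determinant per subset; small ratios indicate high influence.
ratios = {I: det_ratio_hat(H, I) for I in combinations(range(n), m)}
most_influential = min(ratios, key=ratios.get)
n_subsets = comb(n, m)                    # grows combinatorially with n and m
```

The second function relies on the standard identity det(A - U'U) = det(A) det(I - U A^{-1} U'), so each subset needs only an m x m determinant rather than a refit, but the number of subsets still grows as n choose m.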
SIMILAR VOLUMES
A number of robust and diagnostic techniques for linear regression are expressed in terms of p-dimensional subsets of the original sample. In a sample of size n, this leads one to consider $\binom{n}{p}$ subsets. To reduce the computational burden, a limited number of subsets can be selected by means of a sub-sampling