Bias due to two-stage residual-outcome regression analysis in genetic association studies
✍ Scribed by Serkalem Demissie; L. Adrienne Cupples
- Publisher
- John Wiley and Sons
- Year
- 2011
- Language
- English
- Weight
- 90 KB
- Volume
- 35
- Category
- Article
- ISSN
- 0741-0395
✦ Synopsis
Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared-correlation between the SNP and the covariate ($r^2_{SC}$). For example, for $r^2_{SC} = 0.0$, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR under $r^2_{SC} = 0.0$, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided.
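The attenuation described in the synopsis can be illustrated with a small simulation. The sketch below is not the authors' code; it assumes an additive 0/1/2 genotype with a hypothetical minor allele frequency of 0.3, a single normally distributed covariate built to have a chosen squared correlation with the SNP, and unit effect sizes. The function name `compare_estimates` and all parameter values are illustrative assumptions.

```python
import numpy as np


def ols(design, response):
    """Least-squares coefficients for response ~ design (design includes intercept)."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def compare_estimates(r2_sc, beta_snp=1.0, beta_cov=1.0, maf=0.3,
                      n=200_000, seed=0):
    """Return (MLR estimate, two-stage estimate) of the SNP effect.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Additive genotype coded 0/1/2 (assumption: Hardy-Weinberg proportions).
    snp = rng.binomial(2, maf, size=n).astype(float)

    # Covariate constructed so that corr(snp, cov)^2 is approximately r2_sc.
    z = (snp - 2 * maf) / np.sqrt(2 * maf * (1 - maf))
    cov = np.sqrt(r2_sc) * z + np.sqrt(1 - r2_sc) * rng.standard_normal(n)

    # Quantitative trait with independent unit-variance noise.
    y = beta_snp * snp + beta_cov * cov + rng.standard_normal(n)

    ones = np.ones(n)

    # Multiple linear regression: outcome on SNP and covariate jointly.
    mlr = ols(np.column_stack([ones, snp, cov]), y)[1]

    # Two-stage analysis:
    # stage 1 regresses the outcome on the covariate and keeps the residual,
    # stage 2 regresses that residual on the SNP alone.
    stage1_design = np.column_stack([ones, cov])
    residual = y - stage1_design @ ols(stage1_design, y)
    two_stage = ols(np.column_stack([ones, snp]), residual)[1]

    return mlr, two_stage


if __name__ == "__main__":
    for r2 in (0.0, 0.1, 0.5):
        mlr, two_stage = compare_estimates(r2)
        print(f"r2_SC={r2:.1f}  MLR={mlr:.3f}  two-stage={two_stage:.3f}")
```

Under these assumptions, the two-stage estimate should land near `beta_snp * (1 - r2_sc)` while the MLR estimate stays near `beta_snp`, which matches the 0, 10, and 50% attenuation figures quoted above. The simulation addresses bias only; it does not examine the loss of power.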