
Direct analysis of unphased SNP genotype data in population-based association studies via Bayesian partition modelling of haplotypes

โœ Scribed by Andrew P. Morris


Publisher: John Wiley and Sons
Year: 2005
Tongue: English
Weight: 212 KB
Volume: 29
Category: Article
ISSN: 0741-0395


✦ Synopsis


We describe a novel method for assessing the strength of disease association with single nucleotide polymorphisms (SNPs) in a candidate gene or small candidate region, and for estimating the corresponding haplotype relative risks of disease, using unphased genotype data directly. We begin by estimating the relative frequencies of haplotypes consistent with observed SNP genotypes. Under the Bayesian partition model, we specify cluster centres from this set of consistent SNP haplotypes. The remaining haplotypes are then assigned to the cluster with the "nearest" centre, where distance is defined in terms of SNP allele matches. Within a logistic regression modelling framework, each haplotype within a cluster is assigned the same disease risk, reducing the number of parameters required. Uncertainty in phase assignment is addressed by considering all possible haplotype configurations consistent with each unphased genotype, weighted in the logistic regression likelihood by their probabilities, calculated according to the estimated relative haplotype frequencies. We develop a Markov chain Monte Carlo algorithm to sample over the space of haplotype clusters and corresponding disease risks, allowing for covariates that might include environmental risk factors or polygenic effects. Application of the algorithm to SNP genotype data in an 890-kb region flanking the CYP2D6 gene illustrates that we can identify clusters of haplotypes with similar risk of poor drug metaboliser (PDM) phenotype, and can distinguish PDM cases carrying different high-risk variants. Further, the results of a detailed simulation study suggest that we can identify positive evidence of association for moderate relative disease risks with a sample of 1,000 cases and 1,000 controls.
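
The clustering and phase-weighting steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation. The synopsis says only that distance is defined "in terms of SNP allele matches"; the sketch assumes a plain count of mismatching alleles, with ties going to the first centre. It also assumes that phase probabilities follow from Hardy-Weinberg proportions of the estimated haplotype frequencies, and that the two haplotypes of an individual contribute additively on the log-odds scale. All function and parameter names are hypothetical, and covariates and the MCMC sampler itself are omitted.

```python
from itertools import product
from math import exp, log

def mismatches(h1, h2):
    """Count SNP positions where two haplotypes carry different alleles (assumed metric)."""
    return sum(a != b for a, b in zip(h1, h2))

def assign_clusters(haplotypes, centres):
    """Map each haplotype to the index of its nearest centre (ties go to the first)."""
    return {h: min(range(len(centres)), key=lambda k: mismatches(h, centres[k]))
            for h in haplotypes}

def consistent_pairs(genotype):
    """Yield ordered haplotype pairs consistent with an unphased genotype.

    genotype[j] is the number of copies (0, 1 or 2) of allele '1' at SNP j.
    """
    per_site = {0: [("0", "0")], 1: [("0", "1"), ("1", "0")], 2: [("1", "1")]}
    for combo in product(*(per_site[g] for g in genotype)):
        yield "".join(a for a, _ in combo), "".join(b for _, b in combo)

def phase_weights(genotype, freqs):
    """Normalised probability of each consistent pair (Hardy-Weinberg assumption).

    Assumes freqs gives non-zero frequency to at least one consistent pair.
    """
    raw = {(h1, h2): freqs.get(h1, 0.0) * freqs.get(h2, 0.0)
           for h1, h2 in consistent_pairs(genotype)}
    total = sum(raw.values())
    return {pair: w / total for pair, w in raw.items() if w > 0}

def log_likelihood(data, freqs, clusters, intercept, cluster_effects):
    """Logistic log-likelihood, averaging over phase configurations per individual.

    data is a list of (genotype, is_case) tuples; haplotypes in the same
    cluster share one log-odds effect from cluster_effects.
    """
    total = 0.0
    for genotype, is_case in data:
        contribution = 0.0
        for (h1, h2), w in phase_weights(genotype, freqs).items():
            eta = intercept + cluster_effects[clusters[h1]] + cluster_effects[clusters[h2]]
            p = 1.0 / (1.0 + exp(-eta))
            contribution += w * (p if is_case else 1.0 - p)
        total += log(contribution)
    return total

# Toy usage with made-up haplotype frequencies over three SNPs.
freqs = {"000": 0.5, "011": 0.2, "110": 0.2, "111": 0.1}
clusters = assign_clusters(freqs, centres=["000", "111"])
data = [((1, 2, 1), True), ((0, 0, 0), False)]
ll = log_likelihood(data, freqs, clusters, intercept=-1.0, cluster_effects=[0.0, 0.7])
```

Because every haplotype in a cluster shares one effect, the number of risk parameters equals the number of clusters rather than the number of distinct haplotypes, which is the reduction in parameters the synopsis refers to.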


📜 SIMILAR VOLUMES


Streamlined analysis of pooled genotype
โœ Valentina Moskvina; Nadine Norton; Nigel Williams; Peter Holmans; Michael Owen; ๐Ÿ“‚ Article ๐Ÿ“… 2005 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 130 KB ๐Ÿ‘ 1 views

## Abstract Several groups have developed methods for estimating allele frequencies in DNA pools as a fast and cheap way for detecting allelic association between genetic markers and disease. To obtain accurate estimates of allele frequencies, a correction factor __k__ for the degree to which measu