✦ LIBER ✦

Haplotype Inference for Population Data with Genotyping Errors

✍ Scribed by Wensheng Zhu; Anthony Y. C. Kuk; Jianhua Guo

Publisher: John Wiley and Sons
Year: 2009
Tongue: English
Weight: 265 KB
Volume: 51
Category: Article
ISSN: 0323-3847
DOI: 10.1002/bimj.200800215

✦ Synopsis

Abstract

Inference of haplotypes is important in genetic epidemiology studies. However, all large genotype data sets contain errors, caused by fallible, inexpensive genotyping machines and by shortcomings in genotype scoring software, and these errors can have an enormous impact on haplotype inference. In this article, we propose two novel strategies to reduce the impact of genotyping errors on haplotype inference. The first method makes use of double sampling. For each individual, a “GenoSpectrum” consisting of all possible genotypes and their corresponding likelihoods is computed. The second method is a genotype clustering algorithm based on multi‐genotyping data, which also assigns a “GenoSpectrum” to each individual. We then describe two hybrid EM algorithms (called DS‐EM and MG‐EM) that perform haplotype inference based on the “GenoSpectrum” of each individual obtained from double sampling and multi‐genotyping data, respectively. Both simulated data sets and a quasi real‐data set demonstrate that our proposed methods perform well in different situations and outperform the conventional EM algorithm and the HMM algorithm proposed by Sun, Greenwood, and Neal (2007, Genetic Epidemiology 31, 937–948) when the genotype data sets have errors.
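
The idea of running EM over a per-individual set of candidate genotypes can be illustrated with a minimal sketch. This is not the authors' DS‐EM or MG‐EM algorithm, whose details are not given in the abstract. It is a generic EM for haplotype frequencies in which each individual contributes a list of (genotype, weight) pairs standing in for a "GenoSpectrum". The function names, the 0/1/2 coding of biallelic SNP genotypes, and the way weights enter the E-step are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """Return all ordered haplotype pairs (h1, h2) consistent with a genotype.

    Assumption: each site is a biallelic SNP coded 0, 1 or 2
    (number of copies of allele 1).
    """
    per_site = []
    for g in genotype:
        if g == 0:
            per_site.append([(0, 0)])
        elif g == 2:
            per_site.append([(1, 1)])
        else:  # heterozygous site: phase is unknown
            per_site.append([(0, 1), (1, 0)])
    pairs = []
    for combo in product(*per_site):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.append((h1, h2))
    return pairs


def em_haplotype_freqs(spectra, n_sites, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from weighted candidate genotypes.

    spectra: one list per individual of (genotype, weight) pairs
             (a hypothetical stand-in for a GenoSpectrum).
    """
    haps = list(product((0, 1), repeat=n_sites))
    freqs = {h: 1.0 / len(haps) for h in haps}  # uniform start
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: posterior weight of every (genotype, haplotype pair)
        for spectrum in spectra:
            terms = []
            for genotype, weight in spectrum:
                for h1, h2 in compatible_pairs(genotype):
                    terms.append((h1, h2, weight * freqs[h1] * freqs[h2]))
            total = sum(t for _, _, t in terms)
            if total == 0.0:
                continue  # individual carries no information under current freqs
            for h1, h2, t in terms:
                counts[h1] += t / total
                counts[h2] += t / total
        # M-step: normalise expected haplotype counts
        n_chrom = sum(counts.values())
        new_freqs = {h: counts[h] / n_chrom for h in haps}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Usage: three individuals at two SNPs; the first has an uncertain genotype.
example = [
    [((1, 1), 0.9), ((1, 0), 0.1)],
    [((0, 0), 1.0)],
    [((2, 2), 1.0)],
]
estimated = em_haplotype_freqs(example, n_sites=2)
```

When every spectrum holds a single genotype with weight 1, the sketch reduces to the conventional EM on observed genotypes. Spreading weight over several candidate genotypes is what lets a likely genotyping error be down-weighted rather than taken at face value.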

📜 SIMILAR VOLUMES

Haplotype sharing transmission/disequilibrium tests that allow for genotyping errors

✍ Qiuying Sha; Jianping Dong; Renfang Jiang; Huann-Sheng Chen; Shuanglin Zhang 📂 Article 📅 2005 🏛 John Wiley and Sons 🌐 English ⚖ 156 KB

The present study introduces new Haplotype Sharing Transmission/Disequilibrium Tests (HS‐TDTs) that allow for random genotyping errors. We evaluate the type I error rate and power of the new proposed tests under a variety of scenarios and perform a power comparison among the proposed te

Estimating haplotype-disease associations with pooled genotype data

✍ D. Zeng; D.Y. Lin 📂 Article 📅 2004 🏛 John Wiley and Sons 🌐 English ⚖ 282 KB

The genetic dissection of complex human diseases requires large-scale association studies which explore the population associations between genetic variants and disease phenotypes. DNA pooling can substantially reduce the cost of genotyping assays in these studies, and thus enables one to examine a

Inference of Haplotype Effects in Case-Control Studies Using Unphased Genotype and Environmental Data

✍ Xiaowu Chen; Zhaohai Li 📂 Article 📅 2008 🏛 John Wiley and Sons 🌐 English ⚖ 123 KB

Understanding the accuracy of statistical haplotype inference with sequence data of known phase

✍ Aida M. Andrés; Andrew G. Clark; Lawrence Shimmin; Eric Boerwinkle; Charles F. S 📂 Article 📅 2007 🏛 John Wiley and Sons 🌐 English ⚖ 368 KB

Statistical methods for haplotype inference from multi‐site genotypes of unrelated individuals have important application in association studies and population genetics. Understanding the factors that affect the accuracy of this inference is important, but their assessment has been rest

Robust Bayesian Inference for Seemingly Unrelated Regressions with Elliptical Errors

✍ Vee Ming Ng 📂 Article 📅 2002 🏛 Elsevier Science 🌐 English ⚖ 75 KB

Bayesian inference is considered for seemingly unrelated regressions with an elliptically contoured error distribution. We show that the posterior distribution of the regression parameters and the predictive distribution of future observations under the elliptical errors assumption are identical to

A general program for estimation of haplotype frequencies from population diploid data

✍ S.Olesen Larsen 📂 Article 📅 1979 🏛 Elsevier Science ⚖ 560 KB

The program, which is written in FORTRAN, estimates haplotype frequencies in two-locus and three-locus genetic systems from population diploid data. It is based on the gene counting method, which leads to maximum likelihood estimates, and can be used whenever the possible antigens (one or more) on each