Genetic epidemiology is faced with mapping complex traits to genes with relatively small effects whose phenotypes may be modulated by temporal factors. To do this, detailed and accurate data must be available on families, perhaps collected over time. The Framingham Heart Study data supplied to Genet
Genotyping errors, pedigree errors, and missing data
Scribed by Anthony L. Hinrichs; Brian K. Suarez
- Publisher
- John Wiley and Sons
- Year
- 2005
- Tongue
- English
- Weight
- 100 KB
- Volume
- 29
- Category
- Article
- ISSN
- 0741-0395
Synopsis
Our group studied the effects of genotyping errors, pedigree errors, and missing data on a wide range of techniques, with a focus on the role of single-nucleotide polymorphisms (SNPs). Half of the group used simulated data, and half used data from the Collaborative Study on the Genetics of Alcoholism (COGA). The simulated data had no missing genotypes and no genotyping errors, so the group as a whole removed data and introduced artificial errors to study the robustness of various techniques. These analyses showed that genotyping errors are less detectable and may have a greater impact on SNPs than on microsatellites, but recently developed methods that account for genotyping errors help reduce false positives, and the assumptions of these methods appear to be supported by observations from repeated genotyping. Missing data also substantially reduced the ability to detect linkage disequilibrium (LD); this in turn could affect the tagging SNPs chosen to generate haplotypes. In the COGA sample, genotyping measurements were repeated in three ways. First, full-genome screens were performed on three sets of markers: 328 microsatellites, 11,560 SNPs from the Affymetrix GeneChip Mapping 10K Array marker set, and 4,720 SNPs from the Illumina Linkage III panel. Second, the entire Affymetrix marker set was typed on the same 184 individuals by two different laboratories. Finally, the Affymetrix and Illumina marker panels had 94 SNPs in common. The COGA analyses showed that both SNPs and microsatellites can be readily used to identify pedigree errors, and that SNPs have fewer genotyping errors and a low inconsistency rate. However, a fairly high rate of no-calls, especially for the Affymetrix platform, suggests that the true inconsistency rate may be higher than observed.
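The closing point of the synopsis, that no-calls can hide inconsistencies between repeated genotyping runs, can be illustrated with a small sketch. This is not the authors' method: the function name `concordance_summary`, the use of `None` for a no-call, the two-letter genotype strings, and the toy data are all assumptions made for illustration. The observed inconsistency rate is computed only over samples called in both runs; the upper bound treats every no-call pair as if it had been discordant.

```python
from typing import Dict, Optional, Sequence

# Assumption: a no-call is represented as None, and a called genotype
# as an unordered two-letter string such as "AB" (so "AB" == "BA").
Genotype = Optional[str]


def _normalize(genotype: str) -> str:
    """Sort the alleles so that allele order does not matter."""
    return "".join(sorted(genotype))


def concordance_summary(
    calls_a: Sequence[Genotype], calls_b: Sequence[Genotype]
) -> Dict[str, float]:
    """Compare two genotyping runs of the same samples at one marker."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both runs must cover the same samples")

    n_pairs = len(calls_a)
    n_compared = 0   # pairs where both runs produced a call
    n_discordant = 0  # compared pairs whose calls disagree

    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # a no-call in either run: nothing to compare
        n_compared += 1
        if _normalize(a) != _normalize(b):
            n_discordant += 1

    n_no_call = n_pairs - n_compared
    return {
        "n_pairs": n_pairs,
        "n_compared": n_compared,
        "no_call_rate": n_no_call / n_pairs if n_pairs else 0.0,
        # Rate among pairs that could actually be compared.
        "observed_inconsistency": (
            n_discordant / n_compared if n_compared else 0.0
        ),
        # Worst case: every no-call pair would have been discordant.
        "upper_bound_inconsistency": (
            (n_discordant + n_no_call) / n_pairs if n_pairs else 0.0
        ),
    }


# Toy data (hypothetical): six samples typed by two laboratories.
lab_1 = ["AA", "AB", "BB", None, "AB", "AA"]
lab_2 = ["AA", "BA", "AB", "AA", None, "AA"]
summary = concordance_summary(lab_1, lab_2)
print(summary)
```

In this toy example one of the four comparable pairs disagrees, while two of the six pairs cannot be compared at all, so the gap between the observed rate and the worst-case bound grows with the no-call rate.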
SIMILAR VOLUMES
Because most multipoint linkage analysis programs currently assume linkage equilibrium between markers when inferring parental haplotypes, ignoring linkage disequilibrium (LD) may inflate the Type I error rate. We investigated the effect of LD on the Type I error rate and power of nonpa
Inference of haplotypes is important in genetic epidemiology studies. However, all large genotype data sets have errors due to the use of inexpensive, fallible genotyping machines and to shortcomings in genotype scoring software, which can have an enormous impact on haplotype in
Mapping complex traits or phenotypes with small genetic effects, whose phenotypes may be modulated by temporal trends in families, is challenging. Detailed and accurate data must be available on families, whether or not the data were collected over time. Missing data complicate matters
As projects progress from pilot studies with few simple variables and small samples, the research process as a whole becomes qualitatively more complex and subject to an array of contamination by errors and mistakes. Data usually undergo a series of manipulations (e.g., recording, computer entry, tr
The detection of genotyping errors, based on apparent Mendelian incompatibilities in a sample of sib-pairs, is a complicated problem. In the case of a single marker and unknown parental genotypes, all combinations of sib-pair genotypes are self-consistent. Moreover, the observed deviation from equil