
Data mining of public SNP databases for the selection of intragenic SNPs

✍ Scribed by Jan Aerts; Yves Wetzels; Nadine Cohen; Jeroen Aerssens


Book ID: 102258634
Publisher: John Wiley and Sons
Year: 2002
Tongue: English
Weight: 172 KB
Volume: 20
Category: Article
ISSN: 1059-7794


✦ Synopsis


Different strategies to search public single nucleotide polymorphism (SNP) databases for intragenic SNPs were evaluated. First, we assembled a strategy to annotate SNPs onto candidate genes based on a BLAST search of public SNP databases (Intragenic SNP Annotation by BLAST, ISAB). Only BLAST hits that complied with stringent criteria for 1) percentage identity (minimum 98%), 2) BLAST hit length (the hit covers at least 98% of the length of the SNP entry in the database, or the hit is longer than 250 base pairs), and 3) location in non-repetitive DNA were considered valid SNPs. We assessed the intragenic context and redundancy of these SNPs, and demonstrated that the SNP contents of the dbSNP and HGBASE/HGVbase databases are highly complementary but also overlap significantly. Second, we assessed the validity of the intragenic SNP annotation available on the dbSNP and HGVbase websites by comparison with the results of the ISAB strategy. Only a minority of all annotated SNPs was found in common between the respective public SNP database websites and the ISAB annotation strategy. A detailed analysis was performed to explain this discrepancy. In conclusion, we recommend the application of an independent strategy (such as ISAB) to annotate intragenic SNPs, complementary to the annotation provided at the dbSNP and HGVbase websites. Such an approach might be useful in the selection process of intragenic SNPs for genotyping in genetic studies. Hum Mutat 20:162-173, 2002.
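
The three hit-acceptance criteria named in the synopsis can be sketched as a simple filter. This is only an illustration under stated assumptions, not the authors' implementation: the `BlastHit` record, its field names, and the `is_valid_hit` and `filter_hits` helpers are hypothetical, and the repeat status is assumed to have been determined beforehand by some other tool.

```python
from dataclasses import dataclass

# Thresholds taken from the synopsis.
MIN_IDENTITY = 98.0   # minimum percentage identity
MIN_COVERAGE = 0.98   # hit must cover at least 98% of the SNP entry length...
LONG_HIT_BP = 250     # ...or be longer than 250 base pairs


@dataclass(frozen=True)
class BlastHit:
    """Hypothetical record for one BLAST hit of a SNP entry on a gene."""
    snp_id: str
    percent_identity: float  # e.g. 99.2
    hit_length: int          # aligned length in base pairs
    snp_entry_length: int    # full length of the SNP database entry
    in_repeat: bool          # True if the hit lies in repetitive DNA


def is_valid_hit(hit: BlastHit) -> bool:
    """Apply the three criteria: identity, hit length, non-repetitive DNA."""
    identity_ok = hit.percent_identity >= MIN_IDENTITY
    covers_entry = hit.hit_length >= MIN_COVERAGE * hit.snp_entry_length
    long_enough = hit.hit_length > LONG_HIT_BP
    return identity_ok and (covers_entry or long_enough) and not hit.in_repeat


def filter_hits(hits: list[BlastHit]) -> list[BlastHit]:
    """Keep only the hits that pass all criteria."""
    return [h for h in hits if is_valid_hit(h)]
```

For instance, a 300 bp hit at 98.5% identity on a 600 bp entry would pass on the length criterion alone, whereas a hit of exactly 250 bp on the same entry would not, because the synopsis requires the hit to be longer than 250 base pairs.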


📜 SIMILAR VOLUMES


Principal component analysis for selecti
✍ Benjamin D. Horne; Nicola J. Camp 📂 Article 📅 2003 🏛 John Wiley and Sons 🌐 English ⚖ 127 KB

Abstract: Candidate gene association studies often utilize one single nucleotide polymorphism (SNP) for analysis, with an initial report typically not being replicated by subsequent studies. The failure to replicate may result from incomplete or poor identification of disease‐related variants or