Comparison of approaches for machine-learning optimization of neural networks for detecting gene-gene interactions in genetic epidemiology
✍ Scribed by Alison A. Motsinger-Reif; Scott M. Dudek; Lance W. Hahn; Marylyn D. Ritchie
- Publisher: John Wiley and Sons
- Year: 2008
- Language: English
- File size: 384 KB
- Volume: 32
- Category: Article
- ISSN: 0741-0395
✦ Synopsis
Abstract
The detection of genotypes that predict common, complex disease is a challenge for human geneticists. The phenomenon of epistasis, or gene‐gene interactions, is particularly problematic for traditional statistical techniques. Additionally, the explosion of genetic information makes exhaustive searches of multilocus combinations computationally infeasible. To address these challenges, neural networks (NN), a pattern recognition method, have been used. One limitation of the NN approach is that its success is dependent on the architecture of the network. To solve this, machine‐learning approaches have been suggested to evolve the best NN architecture for a particular data set. In this study we provide a detailed technical description of the use of grammatical evolution to optimize neural networks (GENN) for use in genetic association studies. We compare the performance of GENN to that of a previous machine‐learning NN application—genetic programming neural networks in both simulated and real data. We show that GENN greatly outperforms genetic programming neural networks in data sets with a large number of single nucleotide polymorphisms. Additionally, we demonstrate that GENN has high power to detect disease‐risk loci in a range of high‐order epistatic models. Finally, we demonstrate the scalability of the GENN method with increasing numbers of variables—as many as 500,000 single nucleotide polymorphisms. Genet. Epidemiol. 2008. © 2008 Wiley‐Liss, Inc.
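The abstract's point that exhaustive multilocus searches become computationally infeasible can be made concrete with a small count. The sketch below computes how many distinct sets of k loci exist among n single nucleotide polymorphisms, using the 500,000-SNP figure quoted above. The function name `multilocus_combinations` and the choice of interaction orders are illustrative assumptions, not taken from the paper.

```python
from math import comb


def multilocus_combinations(n_snps: int, order: int) -> int:
    """Count the distinct unordered sets of `order` loci among `n_snps` SNPs.

    Hypothetical helper for illustration only; it is not part of GENN.
    """
    return comb(n_snps, order)


if __name__ == "__main__":
    n_snps = 500_000  # largest SNP count mentioned in the abstract
    for order in (2, 3, 4):
        # Print the number of candidate locus sets for each interaction order
        print(f"{order}-locus sets: {multilocus_combinations(n_snps, order):,}")
```

Two-locus models alone give 124,999,750,000 candidate pairs at this scale, and the count grows by roughly five orders of magnitude with each additional locus. This is the kind of growth that motivates an evolutionary search over network architectures in place of enumeration.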