The processes of gene duplication, loss, and lineage sorting can result in incongruence between the phylogenies of genes and those of species. This incongruence complicates the task of inferring the latter from the former. We describe the use of reconciled trees to reconstruct the history of a gene
Inferring species phylogenies from multiple genes: Concatenated sequence tree versus consensus gene tree
✍ By Sudhindra R. Gadagkar; Michael S. Rosenberg; Sudhir Kumar
- Publisher: John Wiley and Sons
- Year: 2005
- Language: English
- File size: 290 KB
- Volume: 304B
- Category: Article
- ISSN: 1552-5007
✦ Synopsis
Abstract
Phylogenetic trees from multiple genes can be obtained in two fundamentally different ways. In one, gene sequences are concatenated into a super‐gene alignment, which is then analyzed to generate the species tree. In the other, phylogenies are inferred separately from each gene, and a consensus of these gene phylogenies is used to represent the species tree. Here, we have compared these two approaches by means of computer simulation, using 448 parameter sets, including evolutionary rate, sequence length, base composition, and transition/transversion rate bias. In these simulations, we emphasized a worst‐case scenario analysis in which 100 replicate datasets for each evolutionary parameter set (gene) were generated, and the replicate dataset that produced a tree topology showing the largest number of phylogenetic errors was selected to represent that parameter set. Both randomly selected and worst‐case replicates were utilized to compare the consensus and concatenation approaches primarily using the neighbor‐joining (NJ) method. We find that the concatenation approach yields more accurate trees, even when the sequences concatenated have evolved with very different substitution patterns and no attempts are made to accommodate these differences while inferring phylogenies. These results appear to hold true for parsimony and likelihood methods as well. The concatenation approach shows >95% accuracy with only 10 genes. However, this gain in accuracy is sometimes accompanied by reinforcement of certain systematic biases, resulting in spuriously high bootstrap support for incorrect partitions, whether we employ site, gene, or a combined bootstrap resampling approach. Therefore, it will be prudent to report the number of individual genes supporting an inferred clade in the concatenated sequence tree, in addition to the bootstrap support.

J. Exp. Zool. (Mol. Dev. Evol.) 304B:000–000, 2005. © 2005 Wiley‐Liss, Inc.
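The two approaches contrasted in the abstract, and its closing recommendation to report how many individual genes support a clade, can be illustrated with a minimal Python sketch. This is not the authors' simulation code. The function names (`concatenate`, `clades`, `gene_support`, `majority_rule`), the nested-tuple tree encoding, and the toy data are all hypothetical. For simplicity the sketch treats trees as rooted and compares clades, whereas the study itself evaluates partitions of trees inferred with NJ, parsimony, and likelihood.

```python
from collections import Counter


def concatenate(alignments):
    """Join per-gene alignments (dict: taxon -> sequence) into one super-gene alignment."""
    taxa = sorted(alignments[0])
    for aln in alignments:
        if sorted(aln) != taxa:
            raise ValueError("every gene must cover the same set of taxa")
    return {t: "".join(aln[t] for aln in alignments) for t in taxa}


def clades(tree):
    """Return the non-trivial clades (frozensets of taxa) of a rooted nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):  # a leaf is a taxon name
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_taxa = leaves(tree)
    found.discard(all_taxa)  # the root clade carries no information
    return found


def gene_support(gene_trees):
    """Count how many gene trees contain each clade."""
    counts = Counter()
    for tree in gene_trees:
        counts.update(clades(tree))
    return counts


def majority_rule(gene_trees):
    """Clades present in more than half of the gene trees (majority-rule consensus)."""
    counts = gene_support(gene_trees)
    return {clade for clade, n in counts.items() if n > len(gene_trees) / 2}


# Toy data (assumed for illustration only).
genes = [
    {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCGT"},
    {"A": "GG", "B": "GG", "C": "GC", "D": "CC"},
]
gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]

supergene = concatenate(genes)        # input for a single tree-building run
consensus = majority_rule(gene_trees)  # clades kept by the consensus approach
support = gene_support(gene_trees)     # per-clade gene counts to report
```

In the concatenation approach, `supergene` would be passed to a tree-building method of choice. The `support` counts can then be reported next to bootstrap values for each clade in the resulting tree, as the abstract suggests.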
📜 SIMILAR VOLUMES
Paralogy is a pervasive problem in trying to use nuclear gene sequences to infer species phylogenies. One strategy for dealing with this problem is to infer species phylogenies from gene trees using reconciled trees, rather than directly from the sequences themselves. In this approach, the optimal s
Toward the goal of recovering the phylogenetic relationships among elapid snakes, we separately found the shortest trees from the amino acid sequences for the venom proteins phospholipase A2 and the short neurotoxin, collectively representing 32 species in 16 genera. We then applied a method we term
Phylogenies based on mitochondrial DNA (mtDNA) may represent gene trees that may not be congruent with the equivalent species tree. One solution to this problem is to include additional, independent loci from the nuclear genome. Sequence data from the seventh intron of the beta-fibrinogen gene were