Inferring ancient divergences requires genes with strong phylogenetic signals
Nature volume 497, pages 327–331 (2013)
Abstract
To tackle incongruence, the topological conflict between different gene trees, phylogenomic studies couple concatenation with practices such as rogue taxon removal or the use of slowly evolving genes. Phylogenomic analysis of 1,070 orthologues from 23 yeast genomes identified 1,070 distinct gene trees, which were all incongruent with the phylogeny inferred from concatenation. Incongruence severity increased for shorter internodes located deeper in the phylogeny. Notably, whereas most practices had little or negative impact on the yeast phylogeny, the use of genes or internodes with high average internode support significantly improved the robustness of inference. We obtained similar results in analyses of vertebrate and metazoan phylogenomic data sets. These results question the exclusive reliance on concatenation and associated practices, and argue that selecting genes with strong phylogenetic signals and demonstrating the absence of significant incongruence are essential for accurately reconstructing ancient divergences.
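The gene-selection idea in the abstract can be illustrated with a minimal sketch. It assumes each gene tree is available as a Newick string whose internal nodes carry numeric support labels (for example, bootstrap percentages). The function names, the toy trees, and the threshold of 60 are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
import re
from statistics import mean

# Matches a numeric support label written directly after a closing parenthesis,
# e.g. the "95" in "(A:0.1,B:0.2)95:0.05". Branch lengths follow a colon and
# are therefore not captured.
SUPPORT_LABEL = re.compile(r"\)(\d+(?:\.\d+)?)")


def average_internode_support(newick: str) -> float:
    """Return the mean of all internal-node support labels in a Newick string."""
    values = [float(v) for v in SUPPORT_LABEL.findall(newick)]
    if not values:
        raise ValueError("no support labels found in tree")
    return mean(values)


def select_genes(gene_trees: dict[str, str], min_average_support: float) -> list[str]:
    """Return the names of genes whose average support meets the threshold."""
    return sorted(
        name
        for name, newick in gene_trees.items()
        if average_internode_support(newick) >= min_average_support
    )


# Hypothetical toy gene trees; taxon names and support values are invented.
gene_trees = {
    "geneA": "((sp1,sp2)95,(sp3,sp4)88,sp5);",
    "geneB": "((sp1,sp3)42,(sp2,sp4)35,sp5);",
    "geneC": "((sp1,sp2)70,(sp3,sp5)61,sp4);",
}

# The threshold of 60 is an arbitrary illustrative cutoff.
selected = select_genes(gene_trees, min_average_support=60)
print(selected)
```

In practice a dedicated tree library would be a more robust way to read support values than a regular expression. The sketch only shows the filtering step: genes with weakly supported internodes are set aside before further inference.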
Acknowledgements
We thank K. Polzin for providing a script that identified alignment sites that contained single substitutions between amino acids that differ in their physicochemical properties. We thank members of the Rokas laboratory and B. O’Meara for valuable comments on this work. This work was conducted in part using the resources of the Advanced Computing Center for Research and Education at Vanderbilt University. This work was supported by the National Science Foundation (DEB-0844968).
Author information
Authors and Affiliations
Department of Biological Sciences, Vanderbilt University, Nashville, 37235, Tennessee, USA
Leonidas Salichos & Antonis Rokas
Contributions
L.S. and A.R. conceived and designed experiments; L.S. carried out experiments; L.S. and A.R. analysed data and wrote the paper.
Corresponding author
Correspondence to Antonis Rokas.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Information
This file contains Supplementary Tables 1-2 and Supplementary Figures 1-17. (PDF 1109 kb)
About this article
Cite this article
Salichos, L., Rokas, A. Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497, 327–331 (2013). https://doi.org/10.1038/nature12130