Abstract
In principle, the character compatibility approach, which removes all homoplasic characters by finding the largest clique of mutually compatible characters in a dataset, provides a powerful means of obtaining the correct topology in difficult-to-resolve cases. However, the usefulness of applying this approach to generalized molecular sequence data for phylogeny determination has not previously been studied. We have used this approach to determine the topology of 23 proteobacterial species (6 each of α-, β- and γ-, 3 δ-, and 2 ε-proteobacteria) using sequence data for 10 conserved proteins (Hsp60, Hsp70, EF-Tu, EF-G, alanyl-tRNA synthetase, RecA, GyrA, GyrB, RpoB and RpoC). All sites in the sequence alignments of these proteins where only two amino acids were found, with each amino acid present in at least two species, were selected. Mutual compatibility of these binary-state sites was determined in two ways. In the first, all of these sites were combined into a large dataset (Set A; 957 characters) prior to compatibility analysis. In the second, compatibility analysis was carried out on the characters from each individual protein, and all compatible sites were then combined into a large dataset (Set B; 398 characters) for further study. The largest cliques obtained from Sets A and B consisted of 337 and 323 compatible characters, respectively. In these cliques, all proteobacterial subgroups were clearly distinguished, and the branching orders of most of the species were also resolved. The ε-proteobacteria exhibited the earliest branching, whereas the β- and γ-subgroups were found to have emerged last. The relative placement of the α- and δ-subgroups, however, was not resolved. The topology of these species was also determined from 16S rRNA sequences and from a concatenated dataset of sequences for all 10 proteins, using the neighbor-joining, maximum likelihood, and maximum parsimony methods.
In the protein trees, all proteobacterial groups were reliably resolved, and they branched in the following order: (ε(δ(α(β,γ)))). In the rRNA trees, however, the γ- and β-subgroups exhibited polyphyletic branching and many internal nodes were not resolved. These results indicate that character compatibility analysis using generalized molecular sequence data provides a powerful means for evolutionary studies. Based on molecular sequences, it should be possible to obtain very large datasets of compatible characters, which should prove very helpful in clarifying difficult-to-resolve phylogenetic relationships.
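The site-selection and clique-finding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' DUALSITE or HARMONY code, whose internals are not given here. It assumes that pairwise compatibility of two binary characters is judged by the standard four-gamete test, that columns containing a gap character (`-`) are skipped, and that the largest clique is found with a Bron-Kerbosch search with pivoting. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations

def binary_sites(alignment, min_count=2):
    """Select columns with exactly two residues, each seen in >= min_count taxa."""
    sites = []
    for col in zip(*alignment):
        if "-" in col:  # assumption: gapped columns are not used
            continue
        states = sorted(set(col))
        if len(states) == 2 and all(col.count(s) >= min_count for s in states):
            # Recode the column as 0/1 relative to the first (sorted) state.
            sites.append(tuple(0 if c == states[0] else 1 for c in col))
    return sites

def compatible(a, b):
    """Four-gamete test: two binary characters conflict only when all four
    state pairs (00, 01, 10, 11) occur among the taxa."""
    return len(set(zip(a, b))) < 4

def largest_clique(sites):
    """Return sorted indices of a maximum clique of mutually compatible sites."""
    n = len(sites)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if compatible(sites[i], sites[j]):
            adj[i].add(j)
            adj[j].add(i)
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:  # r is a maximal clique
            if len(r) > len(best):
                best = sorted(r)
            return
        # Pivot on the vertex with the most neighbours still in p.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return best

# Toy alignment: 5 taxa, 4 columns (invented for illustration only).
alignment = ["AKLV", "AKLI", "GKMV", "GRMI", "GRMV"]
sites = binary_sites(alignment)
clique = largest_clique(sites)
print(len(sites), clique)
```

In this toy case the first three columns are mutually compatible, while the fourth conflicts with each of them, so it is left out of the clique. Finding a maximum clique is exponential in the worst case, so a sketch like this is only practical for modest numbers of characters.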
Acknowledgments
We thank Yan Li for writing the computer algorithms for the DUALSITE and HARMONY programs. The work from R.S.G.’s lab, including support for Yan Li, was funded by a grant from the Natural Sciences and Engineering Research Council of Canada.
Author information
Authors and Affiliations
Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Canada, L8N 3Z5
Radhey S. Gupta
Department of Infection, Immunity and Inflammation, University of Leicester, Leicester, England, LE1 9HN
Peter H. A. Sneath
Corresponding author
Correspondence to Radhey S. Gupta.
Additional information
[Reviewing Editor: Dr. Yves Van de Peer]
About this article
Cite this article
Gupta, R.S., Sneath, P.H.A. Application of the Character Compatibility Approach to Generalized Molecular Sequence Data: Branching Order of the Proteobacterial Subdivisions. J Mol Evol 64, 90–100 (2007). https://doi.org/10.1007/s00239-006-0082-2