
A Large-Scale Comparison of Genomic Sequences: One Promising Approach

Acta Biotheoretica

Abstract

We introduce a novel, linguistics-like method of genome analysis. The approach characterizes a genomic sequence by the occurrences of fixed-length words drawn from a predefined, sufficiently large set of words (strings over the alphabet {A, C, G, T}). The resulting measure, called the compositional spectrum, is a histogram of imperfect (approximate) word occurrences. Our results indicate that the compositional spectrum is an overall characteristic of a long sequence, i.e., a complete genome or an uninterrupted part of a chromosome. This property is manifested in the similarity of spectra obtained from different stretches of the same genome and, simultaneously, in a broad range of dissimilarities between the spectra of different genomes. Because matching is imperfect, the approach is highly flexible, and sets of relatively long words can be considered. The proposed approach may have various applications in intra- and intergenomic sequence comparisons.
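The idea of a histogram of imperfect word occurrences can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration, since the abstract does not specify them: "imperfect" matching is taken to mean a Hamming distance of at most `max_mismatches`, the word set is drawn uniformly at random, and two spectra are compared by the sum of absolute differences of their normalized counts. All function names are hypothetical.

```python
import random

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def random_word_set(n_words, word_len, seed=0):
    """Draw n_words distinct random words (an assumed way to build the set)."""
    if n_words > len(ALPHABET) ** word_len:
        raise ValueError("more words requested than exist at this length")
    rng = random.Random(seed)
    words = set()
    while len(words) < n_words:
        words.add("".join(rng.choice(ALPHABET) for _ in range(word_len)))
    return sorted(words)


def compositional_spectrum(sequence, words, max_mismatches):
    """Count, for each word, the windows of the sequence that match it
    with at most max_mismatches substitutions (assumed matching rule)."""
    word_len = len(words[0])
    if any(len(w) != word_len for w in words):
        raise ValueError("all words must have the same length")
    counts = [0] * len(words)
    for start in range(len(sequence) - word_len + 1):
        window = sequence[start:start + word_len]
        for i, word in enumerate(words):
            if hamming(window, word) <= max_mismatches:
                counts[i] += 1
    return counts


def normalize(counts):
    """Turn raw counts into relative frequencies (all zeros if no matches)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def spectrum_distance(counts_a, counts_b):
    """Sum of absolute differences between normalized spectra
    (an illustrative dissimilarity, not taken from the article)."""
    fa, fb = normalize(counts_a), normalize(counts_b)
    return sum(abs(x - y) for x, y in zip(fa, fb))


if __name__ == "__main__":
    words = ["ACGT", "TTTT"]
    # Windows of "ACGTACGT": ACGT, CGTA, GTAC, TACG, ACGT
    print(compositional_spectrum("ACGTACGT", words, max_mismatches=1))  # [2, 0]
```

This brute-force loop costs time proportional to sequence length times the number of words, so it suits only short test sequences. Under these assumptions, spectra computed on two stretches of one genome could be passed to `spectrum_distance` and compared with the distance between stretches taken from different genomes.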



Author information

Authors and Affiliations

  1. Institute of Evolution, University of Haifa, Mount Carmel, Haifa, 31905, Israel

    Valery Kirzhner, Eviatar Nevo, Abraham Korol & Alexander Bolshoy

