- Article
Random sequences are an abundant source of bioactive RNAs or peptides
- Rafik Neme,
- Cristina Amador,
- Burcin Yildirim,
- Ellen McConnell &
- …
- Diethard Tautz (ORCID: orcid.org/0000-0002-0460-5344)
Nature Ecology & Evolution volume 1, Article number: 0127 (2017)
Abstract
It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise out of random non-coding DNA is so far considered to be negligible, as it seems unlikely that such an RNA or protein sequence could have an initial function that influences the fitness of an organism. Here, we have tested this question systematically, by expressing clones with random sequences in Escherichia coli and subjecting them to competitive growth. Contrary to expectations, we find that random sequences with bioactivity are not rare. In our experiments we find that up to 25% of the evaluated clones enhance the growth rate of their cells and up to 52% inhibit growth. Testing of individual clones in competition assays confirms their activity and provides an indication that their activity could be exerted by either the transcribed RNA or the translated peptide. This suggests that transcribed and translated random parts of the genome could indeed have a high potential to become functional. The results also suggest that random sequences may become an effective new source of molecules for studying cellular functions, as well as for pharmacological activity screening.
Acknowledgements
We thank S. Künzel for sequencing and E. Özkurt for contributions during her rotation project. The project was financed through an ERC advanced grant to D.T. (NewGenes—322564).
Author information
Rafik Neme & Cristina Amador
Present addresses: Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, 1212 Amsterdam Avenue, New York, NY 10027, USA (R.N.); Technical University of Denmark, Department of Biotechnology and Biomedicine, 2800 Kgs Lyngby, Denmark (C.A.).
Authors and Affiliations
Max-Planck Institute for Evolutionary Biology, August-Thienemannstrasse 2, Plön, 24306, Germany.
Rafik Neme, Cristina Amador, Burcin Yildirim, Ellen McConnell & Diethard Tautz
Contributions
R.N. and D.T. designed the experiment, C.A. constructed the library, C.A., B.Y. and E.M. conducted the experiments, R.N. did the bioinformatic analysis, and R.N. and D.T. wrote the paper.
Corresponding author
Correspondence to Diethard Tautz.
Ethics declarations
Competing interests
The work described in this publication is subject to patent application by the Max-Planck Society.
Supplementary information
Supplementary Figures
Supplementary Figures 1–3 (PDF 661 kb)
Supplementary Table 1
Supplementary Table 1 (XLSX 116 kb)
About this article
Cite this article
Neme, R., Amador, C., Yildirim, B. et al. Random sequences are an abundant source of bioactive RNAs or peptides. Nat Ecol Evol 1, 0127 (2017). https://doi.org/10.1038/s41559-017-0127