
Nature
  • Letter
  • Published:

Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals

Nature volume 458, pages 223–227 (2009)

Abstract

There is growing recognition that mammalian cells produce many thousands of large intergenic transcripts [1,2,3,4]. However, the functional significance of these transcripts has been particularly controversial. Although there are some well-characterized examples, most (>95%) show little evidence of evolutionary conservation and have been suggested to represent transcriptional noise [5,6]. Here we report a new approach to identifying large non-coding RNAs using chromatin-state maps to discover discrete transcriptional units intervening known protein-coding loci. Our approach identified ~1,600 large multi-exonic RNAs across four mouse cell types. In sharp contrast to previous collections, these large intervening non-coding RNAs (lincRNAs) show strong purifying selection in their genomic loci, exonic sequences and promoter regions, with greater than 95% showing clear evolutionary conservation. We also developed a functional genomics approach that assigns putative functions to each lincRNA, demonstrating a diverse range of roles for lincRNAs in processes from embryonic stem cell pluripotency to cell proliferation. We obtained independent functional validation for the predictions for over 100 lincRNAs, using cell-based assays. In particular, we demonstrate that specific lincRNAs are transcriptionally regulated by key transcription factors in these processes such as p53, NFκB, Sox2, Oct4 (also known as Pou5f1) and Nanog. Together, these results define a unique collection of functional lincRNAs that are highly conserved and implicated in diverse biological processes.
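
The abstract describes finding chromatin-marked transcriptional units that lie between known protein-coding loci. As a rough illustration of that filtering idea only, and not the authors' pipeline, the following Python sketch keeps candidate domains that overlap no annotated coding locus. The `Interval` type, the function names and the example coordinates are all hypothetical; the actual criteria used in the paper are given in its Methods and are not reproduced here.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; coordinates are 0-based and half-open (assumed convention)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intergenic_domains(domains: List[Interval], coding_loci: List[Interval]) -> List[Interval]:
    """Keep only the candidate domains that overlap no protein-coding locus."""
    return [d for d in domains if not any(overlaps(d, g) for g in coding_loci)]


# Hypothetical candidate domains and coding loci, for illustration only.
candidates = [
    Interval("chr1", 100, 6000),      # overlaps the chr1 gene below
    Interval("chr1", 20000, 30000),   # intergenic
    Interval("chr2", 500, 9000),      # intergenic
]
genes = [Interval("chr1", 5000, 8000), Interval("chr2", 10000, 12000)]

kept = intergenic_domains(candidates, genes)
for d in kept:
    print(d.chrom, d.start, d.end)
```

The pairwise scan is quadratic in the number of intervals, which is acceptable for a toy example; a sorted sweep or an interval tree would be the usual choice for genome-scale annotation sets.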

Figure 1: Intergenic K4–K36 domains produce multi-exonic RNAs.
Figure 2: lincRNA K4–K36 domains do not encode proteins and are conserved in their exons and promoters.
Figure 3: lincRNAs show strong associations with other lincRNAs and with several biological processes.

Accession codes

Primary accessions

Gene Expression Omnibus

Data deposits

Microarray data have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE13765.

References

  1. Bertone, P. et al. Global identification of human transcribed sequences with genome tiling arrays. Science 306, 2242–2246 (2004)
  2. Carninci, P. et al. The transcriptional landscape of the mammalian genome. Science 309, 1559–1563 (2005)
  3. Kapranov, P. et al. Large-scale transcriptional activity in chromosomes 21 and 22. Science 296, 916–919 (2002)
  4. Rinn, J. L. et al. The transcriptional activity of human chromosome 22. Genes Dev. 17, 529–540 (2003)
  5. Ponjavic, J., Ponting, C. P. & Lunter, G. Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Res. 17, 556–565 (2007)
  6. Struhl, K. Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nature Struct. Mol. Biol. 14, 103–105 (2007)
  7. Brannan, C. I., Dees, E. C., Ingram, R. S. & Tilghman, S. M. The product of the H19 gene may function as an RNA. Mol. Cell. Biol. 10, 28–36 (1990)
  8. Brown, C. J. et al. A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome. Nature 349, 38–44 (1991)
  9. Lee, J. T., Davidow, L. S. & Warshawsky, D. Tsix, a gene antisense to Xist at the X-inactivation centre. Nature Genet. 21, 400–404 (1999)
  10. Sotomaru, Y. et al. Unregulated expression of the imprinted genes H19 and Igf2r in mouse uniparental fetuses. J. Biol. Chem. 277, 12474–12478 (2002)
  11. Rinn, J. L. et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell 129, 1311–1323 (2007)
  12. Willingham, A. T. et al. A strategy for probing the function of noncoding RNAs finds a repressor of NFAT. Science 309, 1570–1573 (2005)
  13. Wang, J. et al. Mouse transcriptome: neutral evolution of ‘non-coding’ complementary DNAs. Nature 431, 1–2, doi:10.1038/nature03016 (2004)
  14. Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553–560 (2007)
  15. Griffiths-Jones, S., Grocock, R. J., van Dongen, S., Bateman, A. & Enright, A. J. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34, D140–D144 (2006)
  16. Tam, O. H. et al. Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature 453, 534–538 (2008)
  17. Watanabe, T. et al. Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 453, 539–543 (2008)
  18. Clamp, M. et al. Distinguishing protein-coding and noncoding genes in the human genome. Proc. Natl Acad. Sci. USA 104, 19428–19433 (2007)
  19. Lin, M. F. et al. Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes. Genome Res. 17, 1823–1836 (2007)
  20. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005)
  21. Carninci, P. et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nature Genet. 38, 626–635 (2006)
  22. Su, A. I. et al. Large-scale analysis of the human and mouse transcriptomes. Proc. Natl Acad. Sci. USA 99, 4465–4470 (2002)
  23. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005)
  24. Tanay, A., Sharan, R. & Shamir, R. Discovering statistically significant biclusters in gene expression data. Bioinformatics 18 (Suppl. 1), S136–S144 (2002)
  25. Chang, H. Y. et al. Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc. Natl Acad. Sci. USA 102, 3738–3743 (2005)
  26. Carrio, M., Arderiu, G., Myers, C. & Boudreau, N. J. Homeobox D10 induces phenotypic reversion of breast tumor cells in a three-dimensional culture model. Cancer Res. 65, 7177–7185 (2005)
  27. Ventura, A. et al. Cre-lox-regulated conditional RNA interference from transgenes. Proc. Natl Acad. Sci. USA 101, 10380–10385 (2004)
  28. Loh, Y. H. et al. The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nature Genet. 38, 431–440 (2006)
  29. Ivanova, N. et al. Dissecting self-renewal in stem cells with RNA interference. Nature 442, 533–538 (2006)
  30. Zhao, J., Sun, B. K., Erwin, J. A., Song, J. J. & Lee, J. T. Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science 322, 750–756 (2008)

Acknowledgements

We would like to thank our colleagues at the Broad Institute, especially J. P. Mesirov for discussions and statistical insights, X. Xie for statistical help with conservation analyses, J. Robinson for visualization help, M. Ku, E. Mendenhall and X. Zhang for help generating ChIP samples, and N. Novershtern and A. Levy for providing transcription factor lists. M. Guttman is a Vertex scholar; I.A. acknowledges the support of the Human Frontier Science Program Organization. This work was funded by Beth Israel Deaconess Medical Center, National Human Genome Research Institute, and the Broad Institute of MIT and Harvard.

Author Contributions

J.L.R., E.S.L., A.R. and M. Guttman conceived and designed experiments. The manuscript was written by M. Guttman, A.R., J.L.R. and E.S.L. J.L.R., I.A., C.F., D.F., M.H., B.W.C., J.P.C. and M. Guttman performed molecular biology experiments. All data analyses were performed by M. Guttman in conjunction with M. Garber (conservation analyses), M.F.L. (codon substitution frequency), T.S.M. (ChIP-seq data), O.Z. (motif analysis) and M.N.C. (lincRNA genomic location analysis). Reagents were provided by M. Garber (pre-published conservation analysis tools); T.J. and D.F. (p53 wild-type and knockout MEFs); N.H., A.R. and I.A. (dendritic cell stimulated time course); B.E.B. (ChIP data); R.J., B.W.C. and J.P.C. (luciferase assays); and M.K. and M.F.L. (codon substitution frequency code).

Author information

Author notes
  1. John L. Rinn and Eric S. Lander: These authors contributed equally to this work.

Authors and Affiliations

  1. Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA

    Mitchell Guttman, Ido Amit, Manuel Garber, Courtney French, Michael F. Lin, Maite Huarte, Or Zuk, Tarjei S. Mikkelsen, Nir Hacohen, Bradley E. Bernstein, Manolis Kellis, Aviv Regev, John L. Rinn & Eric S. Lander

  2. Department of Biology

    Mitchell Guttman, Bryce W. Carey, John P. Cassady, Rudolf Jaenisch, Tyler Jacks, Aviv Regev & Eric S. Lander

  3. The Koch Institute for Integrative Cancer Research

    David Feldser & Tyler Jacks

  4. Division of Health Sciences and Technology

    Tarjei S. Mikkelsen

  5. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

    Manolis Kellis

  6. Department of Pathology, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, USA

    Maite Huarte & John L. Rinn

  7. Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02114, USA

    Moran N. Cabili & Eric S. Lander

  8. Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, Massachusetts 02142, USA

    Bryce W. Carey, John P. Cassady, Rudolf Jaenisch & Eric S. Lander

  9. Center for Immunology and Inflammatory Diseases

    Nir Hacohen

  10. Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA

    Bradley E. Bernstein

  11. Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA

    Bradley E. Bernstein & John L. Rinn

Authors
  1. Mitchell Guttman
  2. Ido Amit
  3. Manuel Garber
  4. Courtney French
  5. Michael F. Lin
  6. David Feldser
  7. Maite Huarte
  8. Or Zuk
  9. Bryce W. Carey
  10. John P. Cassady
  11. Moran N. Cabili
  12. Rudolf Jaenisch
  13. Tarjei S. Mikkelsen
  14. Tyler Jacks
  15. Nir Hacohen
  16. Bradley E. Bernstein
  17. Manolis Kellis
  18. Aviv Regev
  19. John L. Rinn
  20. Eric S. Lander

Corresponding author

Correspondence to John L. Rinn.

Supplementary information

Supplementary Figures

This file contains Supplementary Figures 1-11 with Legends (PDF 2081 kb)

Supplementary Information

This file contains Supplementary Methods and Supplementary References (PDF 147 kb)

Supplementary Table 1

In Supplementary Table 1 the K4-K36 domain coordinates are shown and the K4-K36 enriched domains in the 4 mouse cell types are listed. Coordinates are indicated in mouse genome build MM8. (XLS 107 kb)

Supplementary Table 2

In Supplementary Table 2 the lincRNA Exon Coordinates and Pi LOD Enrichment Score are shown. lincRNA exons defined by NimbleGen tiling microarrays are listed in mouse genome build MM9. Each exon has an associated Pi LOD Enrichment Score (Methods) reported. (XLS 174 kb)

Supplementary Table 3

In Supplementary Table 3 the characteristic properties of lincRNAs are shown. (DOC 36 kb)

Supplementary Table 4

In Supplementary Table 4 the PCR validation primer sequences are shown. Primer sequences used for validation of lincRNA expression by PCR and qPCR are reported. (XLS 31 kb)

Supplementary Table 5

In Supplementary Table 5 the Northern blot analysis probe sequences and primers are shown. Primers and amplicons for Northern blot analyses are provided. The correct file for Supplementary Table 5 was uploaded on 4th March, 2009. (XLS 27 kb)

Supplementary Table 6

In Supplementary Table 6 the Codon Substitution Frequency (CSF) Scores are shown. The CSF score for each K4-K36 domain is provided. Coordinates are reported in mouse genome build MM9. An updated version of Supplementary Table 6 was uploaded on 4th March, 2009. (XLS 122 kb)

Supplementary Table 7

In Supplementary Table 7 the Exon conservation for lincRNAs and other annotations are shown. Pi LOD Enrichment scores are provided for lincRNA exons and other annotations compared in the text. The coordinates are provided in Mouse genome MM9 and the max 12-mer LOD score as well as the randomized average max 12-mer LOD score is indicated along with the enrichment score. (XLS 836 kb)

Supplementary Table 8

In Supplementary Table 8 the lincRNA Promoter Conservation is shown. Pi LOD Enrichment scores are provided for each lincRNA promoter region, protein coding gene promoters, and random intergenic regions. Coordinates are provided in Mouse genome build MM9. (XLS 634 kb)

Supplementary Table 9

In Supplementary Table 9 the Human and Mouse orthologous lincRNAs are shown. lincRNAs defined in Human Lung Fibroblasts were lifted into the mouse genome (MM8) and enrichment statistics were computed for Mouse Lung Fibroblasts (Methods). The enrichment p-values and fold are indicated. (XLS 28 kb)

Supplementary Table 10

In Supplementary Table 10 the lincRNA expression across mouse tissue compendium is shown. lincRNA expression levels across various mouse cell types, tissues, and conditions are provided. The values are log values of the relative expression of each lincRNA. (XLS 420 kb)

Supplementary Table 11

In Supplementary Table 11 the Gene Set Enrichment Analysis (GSEA) association matrix is shown. Functional associations between lincRNAs (columns) and MSigDB terms (rows) are indicated. Positive association is indicated by a 1, negative association is indicated by a -1, and no association is indicated by a 0. (TXT 6203 kb)
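
A matrix encoded this way (terms as rows, lincRNAs as columns, entries 1, -1 or 0) can be summarized per lincRNA with a few lines of Python. The sketch below is only an illustration: it assumes a tab-delimited layout with a header row of lincRNA names and the term name in the first column, which the description above does not confirm, and the term and lincRNA names in the toy input are hypothetical.

```python
import csv
import io
from typing import Dict, TextIO, Tuple


def summarize_associations(handle: TextIO, delimiter: str = "\t") -> Dict[str, Tuple[int, int]]:
    """Count positive (1) and negative (-1) term associations for each lincRNA column.

    Assumes the first row holds column names and the first column holds term names.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    lincrnas = header[1:]
    counts = {name: [0, 0] for name in lincrnas}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        for name, value in zip(lincrnas, row[1:]):
            v = int(value)
            if v == 1:
                counts[name][0] += 1
            elif v == -1:
                counts[name][1] += 1
    return {name: (pos, neg) for name, (pos, neg) in counts.items()}


# Toy matrix with hypothetical names, standing in for the real supplementary file.
toy = (
    "term\tlincRNA_A\tlincRNA_B\n"
    "CELL_CYCLE\t1\t0\n"
    "P53_PATHWAY\t1\t-1\n"
    "IMMUNE_RESPONSE\t0\t-1\n"
)
summary = summarize_associations(io.StringIO(toy))
for name, (pos, neg) in summary.items():
    print(name, "positive:", pos, "negative:", neg)
```

To run this against a downloaded copy of the table, an open file handle can be passed in place of the `io.StringIO` object, adjusting `delimiter` if the file turns out to use a different separator.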

Supplementary Table 12

In Supplementary Table 12 the P53 regulated lincRNAs upon DNA Damage Induction are shown. lincRNAs that temporally increase in P53 wild-type cells compared with P53 Knock-out cells upon stimulation with DNA damage are indicated along with their expression levels across the DNA damage time course. (XLS 26 kb)

Supplementary Table 13

In Supplementary Table 13 the P53 Motif Enrichments in induced lincRNAs are shown. P53 motif scores are provided for each lincRNA promoter along with the sequence of the best motif hit and its conservation. P53 induced lincRNAs are indicated in the last column. (XLS 347 kb)

Supplementary Table 14

In Supplementary Table 14 the NFKB regulated lincRNAs are shown. lincRNAs that are differentially expressed in TLR4 stimulation of BMDC cells compared with unstimulated BMDC cells are provided. (XLS 23 kb)

Supplementary Table 15

In Supplementary Table 15 the ES cell lincRNAs bound by Oct4 and/or Nanog are shown. The coordinates of the lincRNAs bound by Oct4/Nanog in ES cells are provided. (XLS 17 kb)

Supplementary Table 16

In Supplementary Table 16 the functional association of lincENC1 is shown. GSEA results for lincENC1 are provided for both profiled exons in the transcript. (XLS 23 kb)

Supplementary Table 17

In Supplementary Table 17 the Enrichment of Gene Ontology (GO) terms for lincRNA neighbors is shown. Significant GO terms (FDR < 0.05) are indicated along with their associated p-values. (XLS 22 kb)

About this article

Cite this article

Guttman, M., Amit, I., Garber, M. et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223–227 (2009). https://doi.org/10.1038/nature07672


Editorial Summary

Large RNAs: conserved for a purpose

Mammalian genomes are transcribed to produce numerous large non-coding RNAs, but their function is unclear, primarily because these transcripts show little or no evidence of evolutionary conservation. A new approach to characterizing these mysterious molecules has now moved the field on. Rather than targeting the RNA molecules themselves, their existence was revealed as chromatin modifications or epigenomic marks in the DNA of four mouse cell types. The search yielded over a thousand large multi-exonic transcriptional units that do not overlap known protein-coding loci and are highly conserved. Possible functions could be assigned to each of these large intervening non-coding RNAs (or lincRNAs), ranging from embryonic stem cell pluripotency to cell proliferation. Specific lincRNAs turn out to be regulated by transcription factors that are key in these processes including p53, NFκB, Sox2, Oct4, and Nanog — and most of these lincRNAs are conserved across mammals.
