Nature
  • Article
  • Published:

Mass-spectrometry-based draft of the Arabidopsis proteome

Nature volume 579, pages 409–414 (2020)

Abstract

Plants are essential for life and are extremely diverse organisms with unique molecular capabilities (ref. 1). Here we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. Our analysis provides initial answers to how many genes exist as proteins (more than 18,000), where they are expressed, in which approximate quantities (a dynamic range of more than six orders of magnitude) and to what extent they are phosphorylated (over 43,000 sites). We present examples of how the data may be used, such as to discover proteins that are translated from short open-reading frames, to uncover sequence motifs that are involved in the regulation of protein production, and to identify tissue-specific protein complexes or phosphorylation-mediated signalling events. Interactive access to this resource for the plant community is provided by the ProteomicsDB and ATHENA databases, which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interactions.

Fig. 1: Tissue map and multi-omics dataset.
Fig. 2: Data exploration in ATHENA and ProteomicsDB.
Fig. 3: Protein and mRNA expression.
Fig. 4: Characterization of protein complexes by protein co-expression and SEC–MS.
Fig. 5: Ascribing function to protein phosphorylation.

Data availability

The data supporting the findings of this study are available within the paper, the Supplementary Information and the public repositories. Source Data for Figs. 1–5 and Extended Data Figs. 1–9 are included with the paper. Transcriptome sequencing and quantification data are available at ArrayExpress (www.ebi.ac.uk/arrayexpress) under the identifier E-MTAB-7978. The raw mass spectrometric data and MaxQuant result files have been deposited to the ProteomeXchange Consortium via PRIDE (ref. 122), with the dataset identifier PXD013868.

References

  1. Krämer, U. Planting molecular functions in an ecological context with Arabidopsis thaliana. eLife 4, (2015).

  2. Peng, J. et al. ‘Green revolution’ genes encode mutant gibberellin response modulators. Nature 400, 256–261 (1999).

  3. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).

  4. Kawakatsu, T. et al. Epigenomic diversity in a global collection of Arabidopsis thaliana accessions. Cell 166, 492–505 (2016).

  5. Cheng, C. Y. et al. Araport11: a complete reannotation of the Arabidopsis thaliana reference genome. Plant J. 89, 789–804 (2017).

  6. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45 (D1), D158–D169 (2017).

  7. Baerenfaller, K. et al. Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320, 938–941 (2008).

  8. van Wijk, K. J., Friso, G., Walther, D. & Schulze, W. X. Meta-analysis of Arabidopsis thaliana phospho-proteomics data reveals compartmentalization of phosphorylation motifs. Plant Cell 26, 2367–2389 (2014).

  9. Durek, P. et al. PhosPhAt: the Arabidopsis thaliana phosphorylation site database. An update. Nucleic Acids Res. 38, D828–D834 (2010).

  10. Sharma, K. et al. Ultradeep human phosphoproteome reveals a distinct regulatory nature of Tyr and Ser/Thr-based signaling. Cell Rep. 8, 1583–1594 (2014).

  11. Schmidt, T. et al. ProteomicsDB. Nucleic Acids Res. 46 (D1), D1271–D1281 (2018).

  12. Gessulat, S. et al. Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning. Nat. Methods 16, 509–518 (2019).

  13. Bienvenut, W. V. et al. Comparative large scale characterization of plant versus mammal proteins reveals similar and idiosyncratic N-α-acetylation features. Mol. Cell. Proteomics 11, mcp.M111.015131 (2012).

  14. Hazarika, R. R. et al. ARA-PEPs: a repository of putative sORF-encoded peptides in Arabidopsis thaliana. BMC Bioinformatics 18, 37 (2017).

  15. Wilhelm, M. et al. Mass-spectrometry-based draft of the human proteome. Nature 509, 582–587 (2014).

  16. Zheng, Y. et al. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol. Plant 9, 1667–1670 (2016).

  17. Yang, M. et al. A comprehensive analysis of protein phosphatases in rice and Arabidopsis. Plant Syst. Evol. 289, 111–126 (2010).

  18. Litt, A. & Kramer, E. M. The ABC model and the diversification of floral organ identity. Semin. Cell Dev. Biol. 21, 129–137 (2010).

  19. Bar-On, Y. M. & Milo, R. The global mass and average rate of rubisco. Proc. Natl Acad. Sci. USA 116, 4738–4743 (2019).

  20. Gupta, R. et al. Time to dig deep into the plant proteome: a hunt for low-abundance proteins. Front. Plant Sci. 6, 22 (2015).

  21. Galván-Ampudia, C. S. & Offringa, R. Plant evolution: AGC kinases tell the auxin tale. Trends Plant Sci. 12, 541–547 (2007).

  22. Zhang, Y., He, J. & McCormick, S. Two Arabidopsis AGC kinases are critical for the polarized growth of pollen tubes. Plant J. 58, 474–484 (2009).

  23. Eraslan, B. et al. Quantification and discovery of sequence determinants of protein-per-mRNA amount in 29 human tissues. Mol. Syst. Biol. 15, e8513 (2019).

  24. Liu, Y., Beyer, A. & Aebersold, R. On the dependency of cellular protein levels on mRNA abundance. Cell 165, 535–550 (2016).

  25. Hanson, G. & Coller, J. Codon optimality, bias and usage in translation and mRNA decay. Nat. Rev. Mol. Cell Biol. 19, 20–30 (2018).

  26. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).

  27. Santner, A. & Estelle, M. The ubiquitin-proteasome system regulates plant hormone signaling. Plant J. 61, 1029–1040 (2010).

  28. Luo, J., Zhou, J. J. & Zhang, J. Z. Aux/IAA gene family in plants: molecular structure, regulation, and function. Int. J. Mol. Sci. 19, E259 (2018).

  29. Bai, B. et al. Seed stored mRNAs that are specifically associated to monosome are translationally regulated during germination. Plant Physiol. 182, 378–392 (2019).

  30. Szklarczyk, D. et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 45 (D1), D362–D368 (2017).

  31. Wang, Y., Tan, X. & Paterson, A. H. Different patterns of gene structure divergence following gene duplication in Arabidopsis. BMC Genomics 14, 652 (2013).

  32. Lloyd, J. & Meinke, D. A comprehensive dataset of genes with a loss-of-function mutant phenotype in Arabidopsis. Plant Physiol. 158, 1115–1129 (2012).

  33. Brandão, M. M., Dantas, L. L. & Silva-Filho, M. C. AtPIN: Arabidopsis thaliana protein interaction network. BMC Bioinformatics 10, 454 (2009).

  34. Kristensen, A. R., Gsponer, J. & Foster, L. J. A high-throughput approach for measuring temporal changes in the interactome. Nat. Methods 9, 907–909 (2012).

  35. Schwartz, D. & Gygi, S. P. An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets. Nat. Biotechnol. 23, 1391–1398 (2005).

  36. Villén, J., Beausoleil, S. A., Gerber, S. A. & Gygi, S. P. Large-scale phosphorylation analysis of mouse liver. Proc. Natl Acad. Sci. USA 104, 1488–1493 (2007).

  37. Battaglia, M., Olvera-Carrillo, Y., Garciarrubio, A., Campos, F. & Covarrubias, A. A. The enigmatic LEA proteins and other hydrophilins. Plant Physiol. 148, 6–24 (2008).

  38. Bah, A. et al. Folding of an intrinsically disordered protein by phosphorylation as a regulatory switch. Nature 519, 106–109 (2015).

  39. Mitra, S. K. et al. An autophosphorylation site database for leucine-rich repeat receptor-like kinases in Arabidopsis thaliana. Plant J. 82, 1042–1060 (2015).

  40. Landry, C. R., Levy, E. D. & Michnick, S. W. Weak functional constraints on phosphoproteomes. Trends Genet. 25, 193–197 (2009).

  41. Hauser, F., Li, Z., Waadt, R. & Schroeder, J. I. SnapShot: abscisic acid signaling. Cell 171, 1708–1708 (2017).

  42. Vaddepalli, P. et al. The C2-domain protein QUIRKY and the receptor-like kinase STRUBBELIG localize to plasmodesmata and mediate tissue morphogenesis in Arabidopsis thaliana. Development 141, 4139–4148 (2014).

  43. Fulton, L. et al. DETORQUEO, QUIRKY, and ZERZAUST represent novel components involved in organ development mediated by the receptor-like kinase STRUBBELIG in Arabidopsis thaliana. PLoS Genet. 5, e1000355 (2009).

  44. Smyth, D. R., Bowman, J. L. & Meyerowitz, E. M. Early flower development in Arabidopsis. Plant Cell 2, 755–767 (1990).

  45. Johnson-Brousseau, S. A. & McCormick, S. A compendium of methods useful for characterizing Arabidopsis pollen mutants and gametophytically-expressed genes. Plant J. 39, 761–775 (2004).

  46. Sprunck, S. et al. Egg cell-secreted EC1 triggers sperm cell activation during double fertilization. Science 338, 1093–1097 (2012).

  47. Karimi, M., Inzé, D. & Depicker, A. GATEWAY vectors for Agrobacterium-mediated plant transformation. Trends Plant Sci. 7, 193–195 (2002).

  48. Clough, S. J. & Bent, A. F. Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J. 16, 735–743 (1998).

  49. Schmid, M. et al. A gene expression map of Arabidopsis thaliana development. Nat. Genet. 37, 501–506 (2005).

  50. Boyes, D. C. et al. Growth stage-based phenotypic analysis of Arabidopsis: a model for high throughput functional genomics in plants. Plant Cell 13, 1499–1510 (2001).

  51. Bowman, J. L. Arabidopsis: an Atlas of Morphology and Development (Springer-Verlag, 1994).

  52. Bradford, M. M. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72, 248–254 (1976).

  53. Ruprecht, B. et al. Optimized enrichment of phosphoproteomes by Fe-IMAC column chromatography. Methods Mol. Biol. 1550, 47–60 (2017).

  54. Marx, H. et al. A large synthetic peptide and phosphopeptide reference library for mass spectrometry-based proteomics. Nat. Biotechnol. 31, 557–564 (2013).

  55. Ruprecht, B., Zecha, J., Zolg, D. P. & Kuster, B. High pH reversed-phase micro-columns for simple, sensitive, and efficient fractionation of proteome and (TMT labeled) phosphoproteome digests. Methods Mol. Biol. 1550, 83–98 (2017).

  56. Smith, P. K. et al. Measurement of protein using bicinchoninic acid. Anal. Biochem. 150, 76–85 (1985).

  57. Zolg, D. P. et al. PROCAL: a set of 40 peptide standards for retention time indexing, column performance monitoring, and collision energy calibration. Proteomics 17, (2017).

  58. Hahne, H. et al. DMSO enhances electrospray response, boosting sensitivity of proteomic experiments. Nat. Methods 10, 989–991 (2013).

  59. Bian, Y. et al. Robust, reproducible and quantitative analysis of thousands of proteomes by micro-flow LC-MS/MS. Nat. Commun. 11, 157 (2020).

  60. Tyanova, S., Temu, T. & Cox, J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protocols 11, 2301–2319 (2016).

  61. Hanada, K. et al. sORF finder: a program package to identify small open reading frames with high coding potential. Bioinformatics 26, 399–400 (2010).

  62. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  63. Li, W., Jaroszewski, L. & Godzik, A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 17, 282–283 (2001).

  64. Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).

  65. Franken, H. et al. Thermal proteome profiling for unbiased identification of direct and indirect drug targets using multiplexed quantitative mass spectrometry. Nat. Protocols 10, 1567–1593 (2015).

  66. Toprak, U. H. et al. Conserved peptide fragmentation as a benchmarking tool for mass spectrometers and a discriminating feature for targeted proteomics. Mol. Cell. Proteomics 13, 2056–2071 (2014).

  67. Oñate-Sánchez, L. & Vicente-Carbajosa, J. DNA-free RNA isolation protocols for Arabidopsis thaliana, including seeds and siliques. BMC Res. Notes 1, 93 (2008).

  68. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  69. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

  70. Silva, J. C., Gorenstein, M. V., Li, G. Z., Vissers, J. P. & Geromanos, S. J. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol. Cell. Proteomics 5, 144–156 (2006).

  71. The Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 45 (D1), D331–D338 (2017).

  72. Cox, J. & Mann, M. 1D and 2D annotation enrichment: a statistical method integrating quantitative proteomics with complementary high-throughput data. BMC Bioinformatics 13 (Suppl. 16), S12 (2012).

  73. Tyanova, S. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 13, 731–740 (2016).

  74. Olsen, J. V. et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127, 635–648 (2006).

  75. Uhlén, M. et al. Transcriptomics resources of human tissues and organs. Mol. Syst. Biol. 12, 862 (2016).

  76. Rijpkema, A. S., Vandenbussche, M., Koes, R., Heijmans, K. & Gerats, T. Variations on a theme: changes in the floral ABCs in angiosperms. Semin. Cell Dev. Biol. 21, 100–107 (2010).

  77. Heazlewood, J. L., Verboom, R. E., Tonti-Filippini, J., Small, I. & Millar, A. H. SUBA: the Arabidopsis Subcellular Database. Nucleic Acids Res. 35, D213–D218 (2007).

  78. Löytynoja, A. Phylogeny-aware alignment with PRANK. Methods Mol. Biol. 1079, 155–170 (2014).

  79. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).

  80. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

  81. van der Graaf, A. et al. Rate, spectrum, and evolutionary dynamics of spontaneous epimutations. Proc. Natl Acad. Sci. USA 112, 6676–6681 (2015).

  82. Gebert, D., Jehn, J. & Rosenkranz, D. Widespread selection for extremely high and low levels of secondary structure in coding sequences across all domains of life. Open Biol. 9, 190020 (2019).

  83. Camiolo, S., Melito, S. & Porceddu, A. New insights into the interplay between codon bias determinants in plants. DNA Res. 22, 461–470 (2015).

  84. Drummond, D. A., Bloom, J. D., Adami, C., Wilke, C. O. & Arnold, F. H. Why highly expressed proteins evolve slowly. Proc. Natl Acad. Sci. USA 102, 14338–14343 (2005).

  85. Das, S. & Bansal, M. Variation of gene expression in plants is influenced by gene architecture and structural properties of promoters. PLoS ONE 14, e0212678 (2019).

  86. Celaj, A. et al. Quantitative analysis of protein interaction network dynamics in yeast. Mol. Syst. Biol. 13, 934 (2017).

  87. Niederhuth, C. E. et al. Widespread natural variation of DNA methylation within angiosperms. Genome Biol. 17, 194 (2016).

  88. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  89. Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).

  90. Nakazawa, N. fmsb: functions for medical statistics book with some demographic data. R package v.0.6.3; https://CRAN.R-project.org/package=fmsb (2018).

  91. Zhang, Z. Variable selection with stepwise and best subset approaches. Ann. Transl. Med. 4, 136 (2016).

  92. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. A Stat. Soc. 58, 267–288 (1996).

  93. R Core Team. R: A language and environment for statistical computing. https://www.R-project.org/ (R Foundation for Statistical Computing, 2014).

  94. Knecht, W. Pilot Willingness to Take Off Into Marginal Weather, Part II: Antecedent Overfitting With Forward Stepwise Logistic Regression. Final Report DOT/FAA/AM-05/15 (Federal Aviation Administration, 2005).

  95. Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).

  96. Groemping, U. Relative importance for linear regression in R: the package relaimpo. J. Stat. Softw. 17, 1–27 (2007).

  97. Heusel, M. et al. Complex-centric proteome profiling by SEC-SWATH-MS. Mol. Syst. Biol. 15, e8438 (2019).

  98. McBride, Z., Chen, D., Reick, C., Xie, J. & Szymanski, D. B. Global analysis of membrane-associated protein oligomerization using protein correlation profiling. Mol. Cell. Proteomics 16, 1972–1989 (2017).

  99. Ruepp, A. et al. CORUM: the comprehensive resource of mammalian protein complexes–2009. Nucleic Acids Res. 38, D497–D501 (2010).

  100. Zhang, B. & Horvath, S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, Article17 (2005).

  101. Langfelder, P., Zhang, B. & Horvath, S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics 24, 719–720 (2008).

  102. Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45 (D1), D353–D361 (2017).

  103. Fabregat, A. et al. The Reactome pathway Knowledgebase. Nucleic Acids Res. 44 (D1), D481–D487 (2016).

  104. Hochberg, Y. B. Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. A Stat. Soc. 57, 289–300 (1995).

  105. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).

  106. List, M. et al. KeyPathwayMinerWeb: online multi-omics network enrichment. Nucleic Acids Res. 44 (W1), W98–W104 (2016).

  107. Letunic, I. & Bork, P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 46 (D1), D493–D496 (2018).

  108. Wagih, O. ggseqlogo: a versatile R package for drawing sequence logos. Bioinformatics 33, 3645–3647 (2017).

  109. Goel, R., Harsha, H. C., Pandey, A. & Prasad, T. S. Human Protein Reference Database and Human Proteinpedia as resources for phosphoproteome analysis. Mol. Biosyst. 8, 453–463 (2012).

  110. Zourelidou, M. et al. The polarly localized D6 PROTEIN KINASE is required for efficient auxin transport in Arabidopsis thaliana. Development 136, 627–636 (2009).

  111. Mayer, U. B. G. & Jurgens, G. Apical-basal pattern formation in the Arabidopsis embryo: studies on the role of the gnom gene. Development 177, 149–162 (1993).

  112. Moes, D., Himmelbach, A., Korte, A., Haberer, G. & Grill, E. Nuclear localization of the mutant protein phosphatase abi1 is required for insensitivity towards ABA responses in Arabidopsis. Plant J. 54, 806–819 (2008).

  113. Tischer, S. V. et al. Combinatorial interaction network of abscisic acid receptors and coreceptors from Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 114, 10280–10285 (2017).

  114. Waterhouse, A. et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46 (W1), W296–W303 (2018).

  115. Nishimura, N. et al. Structural mechanism of abscisic acid binding and signaling by dimeric PYR1. Science 326, 1373–1379 (2009).

  116. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).

  117. Pettersen, E. F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).

  118. Box, M. S., Coustham, V., Dean, C. & Mylne, J. S. Protocol: A simple phenol-based method for 96-well extraction of high quality RNA from Arabidopsis. Plant Methods 7, 7 (2011).

  119. Enugutti, B. et al. Regulation of planar growth by the Arabidopsis AGC protein kinase UNICORN. Proc. Natl Acad. Sci. USA 109, 15060–15065 (2012).

  120. Koncz, C. & Schell, J. The promoter of TL-DNA gene 5 controls the tissue-specific expression of chimaeric genes carried by a novel type of Agrobacterium binary vector. Molecular and General Genetics MGG 204, 383–396 (1986).

  121. Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).

  122. Vizcaíno, J. A. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44 (D1), D447–D456 (2016).

  123. Kwok, S. F. et al. Arabidopsis homologs of a c-Jun coactivator are present both in monomeric form and in the COP9 complex, and their abundance is differentially affected by the pleiotropic cop/det/fus mutations. Plant Cell 10, 1779–1790 (1998).

Acknowledgements

We thank the NGS@tum core facility for RNA sequencing, R. Tofanelli for help with imaging the ovules, R. J. Schmitz for providing data access for the feature analysis and M. Reinecke, F. Bayer and S. Galinec for mass spectrometry measurements. This work was in part funded by the German Science Foundation (DFG, SFB924), a research fellowship to H.S. by the Japan Society for the Promotion of Sciences, and a research fellowship to X.C. by the Chinese Research Council.

Author information

Authors and Affiliations

  1. Chair of Proteomics and Bioanalytics, Technical University of Munich (TUM), Freising, Germany

    Julia Mergner, Martin Frejno, Patroklos Samaras, Daniel P. Zolg, Tobias Schmidt, Mathias Wilhelm & Bernhard Kuster

  2. Chair of Experimental Bioinformatics, Technical University of Munich (TUM), Freising, Germany

    Markus List & Jan Baumbach

  3. Chair of Botany, Technical University of Munich (TUM), Freising, Germany

    Michael Papacek & Erwin Grill

  4. Plant Developmental Biology, Technical University of Munich (TUM), Freising, Germany

    Xia Chen, Ajeet Chaudhary & Kay Schneitz

  5. Center for Plant Molecular Biology, University of Tübingen, Tübingen, Germany

    Sandra Richter & Gerd Jürgens

  6. Chair of Plant Systems Biology, Technical University of Munich (TUM), Freising, Germany

    Hiromasa Shikata & Claus Schwechheimer

  7. Division of Plant Environmental Responses, National Institute for Basic Biology, Okazaki, Japan

    Hiromasa Shikata

  8. Plant Genome and Systems Biology, Helmholtz Center Munich, German Research Center for Environmental Health, Munich-Neuherberg, Germany

    Maxim Messerer, Daniel Lang & Klaus F. X. Mayer

  9. Institute of Network Biology (INET), Helmholtz Center Munich, German Research Center for Environmental Health, Munich-Neuherberg, Germany

    Stefan Altmann & Pascal Falter-Braun

  10. Cell Biology and Plant Biochemistry, University of Regensburg, Regensburg, Germany

    Philipp Cyprys & Stefanie Sprunck

  11. Cellzome GmbH, Heidelberg, Germany

    Toby Mathieson & Marcus Bantscheff

  12. Population Epigenetics and Epigenomics, Technical University of Munich (TUM), Freising, Germany

    Rashmi R. Hazarika & Frank Johannes

  13. Institute of Advanced Study (IAS), Technical University of Munich (TUM), Freising, Germany

    Rashmi R. Hazarika, Frank Johannes & Bernhard Kuster

  14. Chair of Food Chemistry and Molecular Sensory Science, Technical University of Munich (TUM), Freising, Germany

    Corinna Dawid, Andreas Dunkel & Thomas Hofmann

  15. Chair of Microbe-Host Interactions, Ludwig-Maximilians-University (LMU), Munich, Germany

    Pascal Falter-Braun

  16. Plant Genome Biology, Technical University of Munich (TUM), Freising, Germany

    Klaus F. X. Mayer

  17. Bavarian Biomolecular Mass Spectrometry Center (BayBioMS), Technical University of Munich (TUM), Freising, Germany

    Bernhard Kuster

Authors
  1. Julia Mergner

  2. Martin Frejno

  3. Markus List

  4. Michael Papacek

  5. Xia Chen

  6. Ajeet Chaudhary

  7. Patroklos Samaras

  8. Sandra Richter

  9. Hiromasa Shikata

  10. Maxim Messerer

  11. Daniel Lang

  12. Stefan Altmann

  13. Philipp Cyprys

  14. Daniel P. Zolg

  15. Toby Mathieson

  16. Marcus Bantscheff

  17. Rashmi R. Hazarika

  18. Tobias Schmidt

  19. Corinna Dawid

  20. Andreas Dunkel

  21. Thomas Hofmann

  22. Stefanie Sprunck

  23. Pascal Falter-Braun

  24. Frank Johannes

  25. Klaus F. X. Mayer

  26. Gerd Jürgens

  27. Mathias Wilhelm

  28. Jan Baumbach

  29. Erwin Grill

  30. Kay Schneitz

  31. Claus Schwechheimer

  32. Bernhard Kuster

Contributions

J.M. performed (phospho)proteomic and transcriptomic experiments under the supervision of B.K. S.R. and H.S. performed AGC kinase experiments in plants under the supervision of G.J. and C.S. M.P., A.C. and X.C. performed phosphomutant analysis under the supervision of E.G. and K.S. P.C. and S.S. generated and provided plant material. J.M., M.F., M.M., D.L., S.A., D.P.Z., T.M., C.D., A.D. and R.R.H. performed data analysis under the supervision of B.K., K.F.X.M., P.F., M.B., T.H. and F.J. M.L., P.S. and T.S. generated Arabidopsis resource databases under the supervision of M.W. and J.B. J.M., C.S. and B.K. conceptualized the project and wrote the manuscript. All authors edited the manuscript.

Corresponding author

Correspondence to Bernhard Kuster.

Ethics declarations

Competing interests

M.W. and B.K. are founders and shareholders of OmicScouts GmbH and msAId GmbH. They have no operational role in the companies. M.F. and D.P.Z. are founders and shareholders of msAId GmbH. T.M. and M.B. are employees and/or shareholders of Cellzome GmbH. The remaining authors declare no competing interests.

Additional information

Peer review information Nature thanks José Dinneny, Paul Haynes and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Descriptive analysis of the multi-omic tissue atlas.

a, Pairwise global Pearson’s expression correlation analysis of all 30 tissues (n = 1 measurement per tissue) on the transcriptome level (bottom triangle) and proteome level (top triangle) using all identified gene loci. Proteins correlate more strongly between tissues than transcripts. Turquoise squares mark examples for morphologically highly similar tissues. Tissues are coloured as in Fig. 1. b, Scatter plots showing highly reproducible abundance measurements for transcript (top) and protein (bottom) in morphologically similar tissues that were marked in a; namely, node (ND) versus internode (IND), leaf distal (LFD) versus leaf proximal (LFP) and root (RT) versus root upper zone (RTUZ). r denotes the Pearson’s correlation coefficient; n denotes the number of transcripts or proteins. c, Percentage of genes encoded by a specific chromosome that were identified at the transcriptome, proteome or phosphoproteome level. d, Percentage of Swiss-Prot and TrEMBL protein database entries as well as protein evidence categories from UniProt that were identified at the transcriptome, proteome or phosphoproteome level. Evidence level: (1) protein evidence; (2) transcript evidence; (3) homology; (4) predicted; and (5) uncertain. e, Comparison of protein identifications between an earlier Arabidopsis proteome study (ref. 7) based on 12 tissues, this study (30 tissues) and the number of protein-coding genes in Araport11. f, iBAQ intensity distribution of proteins identified in this study. Proteins also identified in a previous study (ref. 7) are projected into the same plot. g, Left, proportion of identified P-sites on S, T or Y residues with highly confident localization of the phosphorylation site within the identified peptide sequence (termed class I P-sites if the localization score is greater than 0.75). Right, distribution of proteins for which phosphorylated S, T or Y residues were identified. h, Left, Venn diagram comparing phosphoprotein datasets from a previous publication (ref. 8), PhosPhAT4.0 and this study. Right, Venn diagram comparing P-site localization confidence between class I sites identified in this study and the low and high confidence datasets reported in a previous publication (ref. 8).
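
The pairwise tissue comparison described for panel a can be illustrated with a small sketch. It assumes a hypothetical table of log-transformed abundances for the same genes in each tissue; the numbers, the tissue label "OTHER" and the helper name `pearson` are invented for illustration and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical log10 abundances of the same five genes in three tissues.
abundance = {
    "ND": [5.1, 6.3, 4.2, 7.0, 5.5],
    "IND": [5.0, 6.4, 4.1, 7.1, 5.6],
    "OTHER": [6.9, 4.0, 5.8, 4.5, 6.2],
}

# One coefficient per unordered tissue pair (one triangle of the matrix).
tissues = list(abundance)
corr = {
    (a, b): pearson(abundance[a], abundance[b])
    for i, a in enumerate(tissues)
    for b in tissues[i + 1:]
}
for pair, r in corr.items():
    print(pair, round(r, 3))
```

Running the same loop once on transcript values and once on protein values would fill the two triangles of a matrix like the one in panel a.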

Extended Data Fig. 2 Proteogenomics and dynamic range of transcript and protein expression.

a, Number of identified N-terminal (NT) or C-terminal (CT) peptides of proteins in either unmodified or phosphorylated form. b, Frequency of amino acids following the initiator methionine in N-terminal peptides with (−X) or without (M−X) cleavage of the initiator methionine. X denotes the amino acid after the start codon. c, Frequency of protein N-terminal acetylation for amino acids in b. Because trypsin was used for protein digestion, the frequencies for Arg and Lys residues could not be determined (n.d.). d, Distribution of peptide-based sequence coverage of proteins in individual tissues and for the combined dataset (tissue abbreviations as in Fig. 1). Boxes contain 50% of the data and show the median as a black line. The top and bottom quartile ranges are shown as whiskers. The number of proteins is indicated for each tissue. e, Pie charts showing the percentage of proteins identified by <3, 3–10 or >10 peptides either allowing shared (razor) peptides or restricting to unique peptides only. f, Left, number of protein isoforms detected at the transcript and protein level compared with the number of all annotated isoforms in Araport11. Right, number of multiple isoforms of the same gene distinguished at the peptide level. g, Validation of protein isoform and sORF identification by comparing the tandem mass spectra from the tissue atlas to those of synthetic peptide reference standards. The normalized spectral contrast angle (SA) was used as a similarity metric (Methods). Candidate isoforms and sORFs were considered valid if the spectral contrast angle of the spectra was >0.7. These data are reported in Supplementary Data 3. h, Amino acid sequence and mirror plots of tandem mass spectra for two peptides of the sORF BIP138_4. The spectra pointing upwards were collected from tissue digests; those pointing downwards were collected from synthetic peptides. The normalized spectral contrast angle and Pearson’s correlation coefficient (r) were used as similarity metrics (Methods) and indicate that both high-scoring spectra (n = 1 acquired spectra) are near identical, thus validating the identification of this sORF as an expressed protein. i, Dynamic range of transcript abundance (grey) and proportion of transcripts that were also identified at the protein level projected into this plot (blue). OM, orders of magnitude. Note that for lower abundance transcripts, fewer proteins were detected. j, Dynamic range of protein abundance and proportion of proteins with phosphorylation evidence. Protein abundance spans six orders of magnitude, whereas transcript abundance only spans four (i). In addition, note that phosphorylation was detected across the entire protein abundance range. k, Percentage of all annotated kinases (K), phosphatases (P), transcription factors (TF) and transcription regulators (TR) detected at the transcript, protein or phosphoprotein levels. Numbers below the x axis denote the number of genes for these protein classes in the A. thaliana genome.
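
As an illustration of the similarity metric used in panels g and h, the sketch below computes a normalized spectral contrast angle under one commonly used definition, SA = 1 − 2·arccos(cos θ)/π, where cos θ is the normalized dot product of two fragment-intensity vectors. The exact procedure used in the study is described in its Methods and may differ; the intensity values and the function name are assumptions made for this example.

```python
import math

def spectral_contrast_angle(a, b):
    """Normalized spectral contrast angle of two intensity vectors.

    Both vectors must list intensities for the same fragment ions in the
    same order. Returns 1.0 for identical relative intensities and 0.0 for
    vectors that share no signal.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return 1.0 - 2.0 * math.acos(cos) / math.pi

# Hypothetical matched fragment intensities (tissue digest vs synthetic peptide).
tissue = [100.0, 40.0, 0.0, 75.0, 20.0]
synthetic = [95.0, 45.0, 5.0, 70.0, 25.0]

sa = spectral_contrast_angle(tissue, synthetic)
print(f"SA = {sa:.3f}, passes 0.7 cut-off: {sa > 0.7}")
```

The 0.7 cut-off in the last line is the validation threshold stated in the caption.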

Extended Data Fig. 3 Descriptive analysis of transcript and protein expression in tissues.

a, Distribution of expression specificity categories for protein and transcript identifications. See Methods for the definition of these categories. In brief, there are very few transcripts and proteins that are only expressed in a single tissue. The quantities of the shared transcripts or proteins can differ vastly between tissues (b). b, Left, protein identifications shared between flower (FL) and flower organs showing an almost complete qualitative overlap of proteins. Sepal (SP), petal (PT), stamen (ST), carpel (CP). Right, clustering of z-scored protein intensities showing distinct quantitative expression differences between flower organs. c, Expression analysis of flower organ identity markers at the protein and transcript level. PISTILLATA (PI, green), APETALA3 (AP3, red), APETALA1 (AP1, orange), AGAMOUS (AG, blue). The expression of these markers is in line with the model of flower organ identity (AP1 expression marking sepal, AP1, AP3, PI marking petal, AG, AP3, PI marking stamen and AG marking carpel). d, Total number of transcripts plotted against the total number of proteins detected in each individual tissue (n = 30 tissues) showing that the more genes are expressed as mRNAs, the more proteins can be detected in a tissue (Pearson’s correlation r = 0.79). Tissues are coloured according to tissue groups as in Fig. 1. e, Cumulative abundance plots of intensity-ranked identifications of transcripts and proteins for five representative tissues. The five most abundant transcripts and proteins are listed in descending order for each tissue. These are generally not the same. In addition, note that the characteristics of the plots are not the same for all tissues. In flower, the protein line rises more quickly than the transcript line. The opposite is true for pollen and a more even characteristic is observed in seed. f, Distribution of shared and unique identifications among the 100 most abundant transcripts and proteins in each tissue. Relatively few proteins and transcripts are found together on the list of the 100 most abundant transcripts and proteins. This demonstrates that the quantitative differences in transcript and protein expression are more important in defining a tissue than the qualitative expression of transcripts or proteins. g, List of 11 proteins that were found as the most abundant protein (in at least one tissue) and their proportion of the total iBAQ intensity in each tissue. Individual proteins can represent up to 9% of the total protein in a given tissue. h, Principal component analysis (PCA) of the core tissue proteomes and transcriptomes (that is, the proteins and transcripts that were identified in every tissue) using z-scored abundances. Only about 30% of all proteins and 20% of all mRNAs were detected in every one of the 30 tissues despite the fact that all tissues were deeply profiled at both protein and transcript level. This shows that strong qualitative and quantitative expression differences exist between tissues. The PCA separates tissues into photosynthetically active versus inactive tissues (component 1) and separates pollen from all other tissues (component 2), indicating that the molecular composition of pollen is particularly different from all other tissues. i, Proportion of the total summed protein intensity for genes with specific subcellular compartment annotation (from SUBA77; Methods) in the different tissue groups. The comparison of photosynthetically active and inactive tissues shows that most of the protein content in photosynthetically active tissues is contained in the plastids, whereas most protein is found in the cytosol for photosynthetically inactive tissues. Proteins with only one single subcellular compartment annotation were selected for the plot and the proportions of their iBAQ intensities were averaged for each tissue group. Nucleus (n = 1,393), endoplasmic reticulum (n = 58), Golgi (n = 68), peroxisome (n = 67), plastid (n = 525), mitochondrion (n = 317), vacuole (n = 71), cytosol (n = 385), cytoskeleton (n = 1), plasma membrane (n = 268), extracellular (n = 351).
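
The cumulative abundance curves of panel e and the per-protein share of total intensity in panel g rest on the same simple calculation, sketched here. The iBAQ values are hypothetical and the function name `cumulative_share` is an assumption made for this example.

```python
from itertools import accumulate

def cumulative_share(intensities):
    """Cumulative fraction of total intensity, most abundant entry first."""
    ranked = sorted(intensities, reverse=True)
    total = sum(ranked)
    return [running / total for running in accumulate(ranked)]

# Hypothetical iBAQ intensities for six proteins in one tissue.
ibaq = [9.0e8, 3.0e8, 2.5e8, 4.0e7, 9.0e6, 1.0e6]

curve = cumulative_share(ibaq)
print(f"share of the most abundant protein: {curve[0]:.1%}")
print([round(v, 3) for v in curve])
```

The first element of the curve is the share of the single most abundant protein, and a curve that rises quickly indicates that a few entries dominate the total signal.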

Extended Data Fig. 4 Relationships between transcript and protein levels.

a, Pearson’s correlation (r) of transcriptome and proteome expression (core datasets; n = 5,043) for each tissue. b, Pearson’s correlation between measured and predicted protein abundance levels in all tissues. Predicted protein abundance levels were obtained from the best fitting feature selection model for each tissue (Methods). The number of genes used for the correlation analysis is indicated for each tissue. c, Violin plots showing the spread in relative contribution of selected features to the prediction of gene-level protein abundance across tissues (n = 30 tissues) using our model. Violin shapes show the kernel density estimation of the data distribution and the median as white dot. Thick black bars denote the interquartile range. d, Specific nucleotide sequence motifs in 5′ UTRs of mRNAs contribute to the prediction of protein levels in a subset of tissues. Clustering tissues based on the presence or absence of detected 5′ UTR motifs shows that several features are repeatedly selected for inclusion in the model while others appear to be more tissue-specific. e, On the basis of the observation that the dN/dS ratio between orthologues of A. thaliana and A. lyrata contributed to the prediction of protein levels (c), we analysed this feature in more detail. Left, distribution of the dN/dS ratio for orthologous genes in A. thaliana and A. lyrata. The distribution is plotted for the example of ‘leaf distal’ (n = 6,447 genes). To compare evolutionarily conserved genes (defined by low dN/dS ratios) and genes that evolve neutrally or are under positive selection (high dN/dS ratios), we selected the bottom 5% and top 5% of the dN/dS ratio distribution, respectively. Right, evolutionarily conserved genes (low dN/dS ratio) show 10–20 times higher protein abundance than genes under evolutionary pressure. Boxes contain 50% of the data and show the median as a black line. Whiskers denote 1.5 times the interquartile range. Outliers were omitted from the plot for clarity. f, Time-course analysis of median protein abundance changes after treatment with CHX (translation block) or MG132 (proteasome block) versus time-matched DMSO control samples (Methods). Boxes contain 50% of the data and show medians as black lines. Whiskers denote 1.5 times the interquartile range. Outliers were omitted from the plot for clarity but were included in the statistical tests below. All proteins in the experiment (n = 8,920, grey), proteins that have a high PTR in seed (n = 425, red) or a low PTR in seed (n = 254, blue) (defined as in Fig. 3d) are shown. Differences between time points were tested for significance within each subset (all; high PTR; low PTR) using one-way ANOVA and the post hoc Tukey HSD test. ***P < 0.001 (all_CHX8–CHX16: P < 1 × 10⁻⁷; all_CHX8–CHX24: P < 1 × 10⁻⁷; all_CHX16–CHX24: P = 0.0002; highPTR_CHX8–CHX24: P = 0.0003; lowPTR_CHX8–CHX16: P = 0.0000004; lowPTR_CHX8–CHX24: P < 1 × 10⁻⁷; lowPTR_CHX16–CHX24: P < 1 × 10⁻⁷). g, Representative images of seeds after 4 days of incubation with CHX, MG132 or DMSO control medium (n = 1). Germination was completely inhibited by CHX and partially inhibited by MG132, showing that the drug treatments were effective.

Extended Data Fig. 5 Correlations between transcriptomes, proteomes and phosphoproteomes.

a, Median PTRs across tissues plotted against the inter-tissue variation of these PTRs (expressed as MAD; proteins and transcripts had to be detected in at least 10 matching tissues to be included in the analysis). Arrows denote examples of genes with high PTRs (rbcL and petA) and low PTRs (IAA8 and IAA13). Bar plot shows the MAD range segmented into five quantiles, each containing the same number of genes (coloured bars and dashed lines). Most genes have reasonably stable PTRs across tissues. b, As in a (dataset n = 14,069) but for transcript (left) and protein (right) measurements. There is more variation in protein levels across tissues than there is mRNA variation (80% of all transcripts show a MAD of <1; 80% of all proteins show a MAD of 1.2). There is also more variation in the protein levels across tissues for low abundant proteins. This may in part be due to technical limitations as low abundance proteins can generally be less accurately quantified. c, As in a but for the ratio of phosphorylation site versus protein abundance. P-sites and proteins had to be detected in at least 10 matching tissues to be included in the analysis (n = 13,793). d, As in b (dataset n = 13,793) but for P-site abundance. P-site abundance shows greater variation across tissues than protein abundance (60% of all P-sites show MAD <1 compared with 80% of all proteins; see b). Again, this may in part be due to technical limitations as P-site quantification is performed on a peptide level and does not benefit from aggregating multiple peptide quantifications into one value for protein quantification.
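
The summary statistics plotted in panel a (median protein-to-RNA ratio and its median absolute deviation across tissues) can be sketched as below. The sketch assumes that abundances are on a log10 scale, so that the ratio becomes a difference; that scale, the tissue labels and the function name `ptr_summary` are assumptions made for this example. The minimum of 10 matching tissues is taken from the caption.

```python
from statistics import median

MIN_TISSUES = 10  # inclusion threshold stated in the caption

def ptr_summary(log_protein, log_transcript):
    """Median and MAD of per-tissue log ratios for one gene.

    Inputs map tissue name -> log10 abundance. Only tissues present in both
    mappings are used. Returns None if fewer than MIN_TISSUES tissues match.
    """
    shared = sorted(set(log_protein) & set(log_transcript))
    if len(shared) < MIN_TISSUES:
        return None
    ratios = [log_protein[t] - log_transcript[t] for t in shared]
    med = median(ratios)
    mad = median(abs(r - med) for r in ratios)
    return med, mad

# Hypothetical gene measured in 12 tissues.
tissues = [f"T{i}" for i in range(12)]
protein = {t: 7.0 + 0.1 * (i % 3) for i, t in enumerate(tissues)}
transcript = {t: 2.0 for t in tissues}

print(ptr_summary(protein, transcript))
```

A gene with a small MAD in this sketch corresponds to one whose ratio is stable across tissues, which is the behaviour the caption reports for most genes.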

Extended Data Fig. 6 Inferring redundant gene function and physical interactions from co-expression analysis.

a, Scatter plot of Pearson’s correlation coefficients (r) as a measure for co-expression across tissues for all pairs of proteins (x axis) and all pairs of transcripts (y axis) (core dataset only, n = 5,043) along with their marginal histograms. Colours denote the log10-normalized STRING scores of individual gene pairs as a measure of known or predicted direct (physical) or indirect (functional) associations. Transcripts or proteins (or both) that are strongly co-expressed are more strongly related (physically or functionally) than those that are not. b, Co-expression analysis of duplicated genes (pairs had to be detected in at least 10 matching tissues to be included in the analysis). The density plots show the distribution of Pearson’s correlation coefficients (r) of co-expressed transcripts (grey) or proteins (blue) for genes that arose by whole-genome duplications (WGD), local duplications or transposon-mediated duplications. Randomly selected gene pairs are shown as a control. Medians are given and displayed as dotted lines. There is substantial co-expression of duplicated genes, indicating that these genes probably have redundant functions. c, Left, protein-level Pearson’s correlation coefficient (r) values (from b) for all duplicate gene pairs (WGD, local, transposed) plotted against the protein abundance ratio of each pair (average across 30 tissues) (Methods). Blue arrows denote an example of a high or low ratio of protein production for the duplicated genes. Right, example for tissue-resolved protein intensity proportions (top-3) (Methods) for the duplicate pair MAC5A and MAC5B. Irrespective of the tissue, MAC5A is always much more highly expressed than MAC5B. Tissues are coloured as in Fig. 1. d, Top, ranked protein abundance ratio for selected duplicate pairs (mean ± s.d.; n = 30) and annotated for phenotypic effects (bottom) in the loss-of-function mutant for either duplicate 1 or duplicate 2 (+). Minus symbols denote absence of a phenotypic effect. Asymmetric protein production within duplicate pairs can be associated with the occurrence of a phenotype in the loss-of-function mutant of the higher expressed duplicate protein, indicating a dominant functional role of the more highly expressed protein. Blue arrows highlight MAC5A–MAC5B and PHB3–PHB4 as examples. e, Inference of physical protein–protein interactions from co-expression data. Distribution of pairwise Pearson’s correlation coefficients (r) of co-expressed proteins across (at least 10) tissues that are subunits of selected protein complexes. r > 0.5 (shaded in grey) was chosen as a cut-off for the selection of proteins for subsequent analysis to make sure that proteins present in well-characterized protein complexes are retained. CONSTITUTIVE PHOTOMORPHOGENESIS9 SIGNALOSOME (CSN), CELLULOSE SYNTHASE (CESA). f, Recovery of annotated protein–protein interactions by co-expression analysis. Distribution of Pearson’s correlation coefficients (r) of pairs of transcripts (grey) or protein (blue) that are annotated to interact physically in the AtPIN database (ref. 33) (pairs had to be detected in at least 10 matching tissues to be included in the analysis). Subsets of the AtPIN database are shown, namely interactions detected by the yeast two-hybrid (Y2H) method, by affinity purification–mass spectrometry (AP–MS) or both. Values of r > 0.5 are shaded in blue (protein). Dotted lines denote median values. Co-expression only recovers a minority of annotated physical interactions and interactions supported by more than one line of experimental evidence also tend to show stronger co-expression.

Extended Data Fig. 7 Inferring protein complexes and subunit stoichiometry from proteome correlation profiling using SEC–MS.

a, Molecular mass (MW) of monomeric proteins (determined from sequence) plotted against the mass determined from the apex of the elution profile for proteins identified by SEC–MS fractions of flower tissue (sFL). Inset shows the molecular mass calibration of the SEC column using a protein calibration standard (mass between 44 and 690 kDa). The distribution of proteins annotated in Araport11 is shown at the top. Many proteins show a much higher apparent molecular mass than would be expected from their sequences (data points above the x = y line). This suggests that these proteins engage in physical protein interactions that are sufficiently stable during SEC separation. b, SEC traces of proteins from five well-characterized protein complexes for flower, leaf and root tissue. Although the resolution of SEC separations is not very high, the complex subunits show very strong co-elution behaviour and the SEC separations of the five complexes are reproducible between tissues. CoA carboxylase, n = 4 proteins; CDC48, n = 3 proteins; RubisCO, n = 4 proteins; prefoldin, n = 6 proteins; SCS, n = 3 proteins. c, Intensity-normalized SEC elution profile of proteins for flower tissue. Proteins are ordered based on the SEC fraction in which their intensity peaks and the data are displayed as a heat map (n = 2,485 protein traces). Co-eluting proteins were grouped into ‘trace modules’ (Methods). Proteins in trace modules may represent members of protein complexes and thus serve as candidates for further experimental validation. d, To quantify how well protein complexes can be detected using co-expression analysis from data in the tissue atlas (TA) or by SEC–MS, a summary statistic termed ‘complex index’ was calculated (Methods). The complex index is 1 when all subunits of a complex are identified in the same module and no other proteins are contained in the module. Bar plots show examples for complex indices obtained from the different datasets and are divided into large (>4 subunits) and small (≤4 subunits) protein complexes (according to UniProt). Co-expression alone generates many candidates of interactors, but combining co-expression and SEC–MS analysis is an efficient way to prioritize candidates for follow-up experiments. e, Subunit heterogeneity within the coatomer complex. The coatomer complex consists of seven subunits, five of which (α, β, β′, ε and ζ) can be provided by twelve paralogues of these five genes. Plots show the protein proportions of these paralogues in all 30 tissues (data from tissue atlas). The coatomer complex has a similar composition in most tissues. A notable exception is seed tissues, in which production of subunit ζ-1 dominates over the two other paralogous proteins, suggesting that the coatomer complex in seed tissue also preferentially contains the ζ-1 subunit. Tissues are coloured as in Fig. 1. f, Absolute SEC intensity traces of individual complex subunits for determining subunit stoichiometry. Examples from left to right: the chaperonin complex (flower, 8 proteins, ratio of all subunits: 1:1), the 26S proteasome core and lid (flower, 14+17 proteins, ratio of all subunits: 1:1), the COP9 signalosome (flower, CSN; 8 proteins, ratio of all subunits: 1:1) and the CESA1–CESA3–CESA6 complex (root, 3 proteins, ratio of all subunits: 1:1). CSN3 and CSN5 were detected both as part of the CSN complex and in monomeric form. g, Top, total intensity of protein complex subunits across all tissues for the complexes shown in f (subunit intensities from the tissue atlas). Middle, relative proportion (mean ± s.d.; n = 30 tissues) of subunits across tissues (Methods). For the CESA complex, ratios were calculated for the subunit combinations CESA1–CESA3–CESA6 and CESA4–CESA7–CESA8. The stoichiometries determined from the tissue expression data are generally well-aligned with the expected 1:1 ratio of subunits in these complexes. As noted in f, a substantial amount of CSN5 was detected as a monomer in the SEC analysis, and the tissue expression atlas also shows higher relative expression of this protein compared with all other complex partners. This suggests that the protein is produced in excess over what is required for the COP9 complex (as observed previously; ref. 123), and may therefore indicate an additional function within the cell.
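
The caption for panel d gives only one property of the ‘complex index’: it equals 1 when a module contains all subunits of a complex and nothing else. The sketch below shows one simple score with that property, a Jaccard overlap between the annotated subunits and a module. This is an assumption made for illustration and should not be read as the definition used in the study, which is given in its Methods; the function names and the placeholder protein identifiers are likewise hypothetical.

```python
def complex_index(subunits, module):
    """Jaccard overlap between annotated subunits and one module.

    Equals 1.0 only if the module holds every subunit and no other protein.
    """
    subunits, module = set(subunits), set(module)
    union = subunits | module
    return len(subunits & module) / len(union) if union else 0.0

def best_complex_index(subunits, modules):
    """Highest index over all modules (each module is a set of protein IDs)."""
    return max((complex_index(subunits, m) for m in modules), default=0.0)

# Placeholder identifiers: four annotated subunits and two candidate modules.
annotated = {"S1", "S2", "S3", "S4"}
modules = [
    {"S1", "S2", "X1", "X2", "X3", "X4"},  # two subunits diluted by others
    {"S1", "S2", "S3", "S4"},              # exact match
]

for m in modules:
    print(sorted(m), complex_index(annotated, m))
print("best:", best_complex_index(annotated, modules))
```

A score of this kind penalizes both missing subunits and unrelated co-clustered proteins, which is why large, loosely defined modules score low even when they contain part of a complex.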

Extended Data Fig. 8 Kinases, phosphatases and phosphorylation motifs.

a, Percentage of annotated kinase and phosphatase family members detected at the protein or phosphoprotein level. Parentheses denote the number of genes in each family in the Arabidopsis genome. b, Tissue-resolved combined intensity (that is, protein abundance) of families of kinases (left) and phosphatases (right). Tissues are coloured as in Fig. 1. Several tissues (notably pollen) stand out in terms of the expression of kinases and phosphatases, which indicates that these tissues are particularly active in phosphorylation-mediated dynamic signalling. c, Top, pie chart of specificity categories for kinases and phosphatases (see Methods for definition). Bottom, distribution of tissue-enhanced kinases and phosphatases across the 30 tissues. Several tissues (such as pollen) stand out in terms of the expression of certain kinases and phosphatases, which indicates tissue-specific signalling. d, Pie charts showing the proportion of proline-directed, acidic, basic and other motif categories for phosphorylated Ser (pS), Thr (pT) and Tyr (pY) residues. Only class I P-sites (localization score > 0.75; Methods) were considered in this analysis. e, Example motif logo plots for motifs such as proline-directed, acidic and basic. P-site motifs were identified using the motif-X algorithm (see Supplementary Table 2 for all 266 motifs). n denotes the number of phosphorylation sites that contain the respective motif; ‘fc’ denotes the fold change (that is, enrichment) of the motif in phosphorylated versus unmodified peptides (Methods). f, Enrichment of proline-directed (yellow), acidic (red), basic (blue) and other (grey) sequence motifs (circles) in the serine P-site dataset versus the same motifs detected in the background dataset of unmodified peptides (Methods). Motifs are shown for two, three and four fixed amino acid positions. The P-site in each motif example is underlined. ‘X’ denotes any amino acid. g, Number of identified P-sites for a given protein plotted against the sequence lengths of the same protein. LEA proteins are shown. h, Schematic of the LEA protein sequences (black bars). Pink denotes phosphorylated and blue denotes unphosphorylated STY residues. Almost all STY residues in LEA proteins can be phosphorylated. i, Schematic of the sequences and domain topology of the receptor-like kinases SRF4, FER and CERK1. P-sites often preferentially occur in specific domains, notably the juxtamembrane domain. Protein sequence regions covered by identified peptides are marked in blue, and P-sites are marked in pink.
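
The fold change (‘fc’) mentioned for panel e compares how often a motif occurs among phosphorylated sequences with how often it occurs in a background of unmodified sequences. The sketch below shows that ratio for a proline-directed motif; it is a simplified illustration rather than the motif-X algorithm itself, and the sequence windows, the regular expression and the function name are assumptions made for this example.

```python
import re

def motif_fold_change(pattern, foreground, background):
    """Ratio of a motif's match frequency in foreground vs background."""
    rx = re.compile(pattern)
    fg = sum(bool(rx.fullmatch(s)) for s in foreground) / len(foreground)
    bg = sum(bool(rx.fullmatch(s)) for s in background) / len(background)
    return fg / bg if bg else float("inf")

# Hypothetical 7-residue windows centred on a serine (position 4).
phospho = ["AAKSPEL", "GLRSPTT", "EEDSDEE", "QTASPVK"]
unmodified = [
    "AAKSLEL", "GLRSPTT", "EEDSGEE", "QTASAVK",
    "LLKSVRR", "PPGSKLM", "TTESPAA", "MKRSQQQ",
]

# Proline-directed motif: serine followed directly by proline.
fc = motif_fold_change(r"...SP..", phospho, unmodified)
print(f"fold change = {fc:.1f}")
```

Here the motif matches three of four phosphorylated windows and two of eight background windows, so the printed fold change is 3.0.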

Extended Data Fig. 9 Functional analysis of phosphorylation mutants of RCAR10 and QKY.

a, P-site localization within the structure of RCAR10. The RCAR10 structure (blue) was modelled using the RCAR11 protein crystal structure (cornflower blue) as a template (ref. 115). ABA-binding loops are shown in turquoise, P-sites in pink, and ABA ligand in yellow. b, RCAR10 expression across tissues at the protein (blue, iBAQ), transcript (grey, TPM) and P-site (pink, intensity) level. c, Tissue-resolved total protein intensity and relative proportions of the members of the PP2C co-receptor family. Seed tissues stand out in terms of overall expression as well as the dominance of AHG1 in these tissues. d, Measurement of the ABA response after expression of RCAR10 or phosphomimetic mutant variants in combination with different PP2C co-receptors in protoplasts (Methods). Columns display the average ABA response (mean ± s.d., n = 3) and grey dots indicate individual measurements. Co-expression of the phosphatases HAI1–HAI3 leads to similar responses in both phosphomimetic mutants, whereas other co-expressed phosphatases show diverse responses. e, QKY expression across tissues at the protein (blue, iBAQ), transcript (grey, TPM) and P-site (pink, intensity) level. f, Members of the MCTP family clustered by sequence similarity (left) and schematic of their domain structures along with detected P-sites (right). MCTP11a, MCTP12 and MCTP13 were not detected (n.d.) in this study. MCTP15 (also known as QKY) is in bold. g, Number of independent transgenic plant lines (qky-9 mutant background) transformed with wild-type QKY, phosphomutant (S262A, SA; blue) or phosphomimetic (S262E, SE; purple) constructs that show complete, partial or no rescue of the mutant phenotype. qPCR results (mean ± s.d., n = 3; individual data points as grey dots) show the relative transgene expression in wild-type, qky-9 mutant and selected transgenic lines. h–j, Representative confocal images of six-day-old qky-9 pQKY::mCherry:QKY (WT QKY; n = 14 roots), qky-9 pQKY::mCherry:QKY(S262A) (phosphomutant; n = 21 roots), qky-9 pQKY::mCherry:QKY(S262E) (phosphomimetic; n = 15 roots) root epidermal cells of the meristematic zone. The punctate signal along the cell circumference shows the expected localization of the QKY protein. Arrows indicate punctate structures. Scale bars, 5 μm.

Supplementary information

Supplementary Tables: 

This file contains Supplementary Table 1: List of validated peptide spectra for uncertain proteins including mirror plots between experimental and predicted spectra. Supplementary Table 2: Position weight matrix logo plots for all identified serine, threonine and tyrosine phosphorylation motifs.

Supplementary Data

Supplementary Data 1: This file contains tissue sample names and a description of their growth conditions.

Supplementary Data 2: This file contains the tissue atlas expression values at the transcriptome, proteome and P-site level.

Supplementary Data 3: This file contains a summary of expression evidence for gene isoforms and sORF sequences.

Supplementary Data 4: This file contains the results for 1D GO term enrichment analyses using transcript abundance or protein to RNA ratio (PTR) values as input.

Supplementary Data 5: This file contains a description of feature analysis parameters used in the protein abundance prediction model.

Supplementary Data 6: This file contains protein expression values for selected paralogous genes and information about their mutational phenotypes.

Supplementary Data 7: This file contains size-exclusion chromatography protein elution profiles for flower, leaf and root tissue.

Supplementary Data 8: This file contains GO term enrichment analysis for WGCNA protein and trace modules and lists gene identifiers in protein and trace module intersections.

Supplementary Data 9: This file contains annotations of selected complexes and their identification in tissue atlas or size-exclusion chromatography experiments.

Supplementary Data 10: This file lists all identified serine, threonine and tyrosine phosphorylation motifs.

Supplementary Data 11: This file contains a list of all primer sequences and gene IDs used in this study.

About this article

Cite this article

Mergner, J., Frejno, M., List, M. et al. Mass-spectrometry-based draft of the Arabidopsis proteome. Nature 579, 409–414 (2020). https://doi.org/10.1038/s41586-020-2094-2
