NAR Genom Bioinform. 2024 Feb 7;6(1):lqae011.
doi: 10.1093/nargab/lqae011. eCollection 2024 Mar.

SumoPred-PLM: human SUMOylation and SUMO2/3 sites Prediction using Pre-trained Protein Language Model

Andrew Vargas Palacios et al. NAR Genom Bioinform.

Abstract

SUMOylation is an essential post-translational modification system with the ability to regulate nearly all aspects of cellular physiology. The three major paralogues, SUMO1, SUMO2 and SUMO3, are small ubiquitin-like modifiers that are covalently attached to lysine residues at consensus sites in protein substrates. Biochemical studies continue to identify unique biological functions for protein targets conjugated to SUMO1 versus the highly homologous SUMO2 and SUMO3 paralogues. Yet the field has failed to harness contemporary AI approaches, including pre-trained protein language models, to fully expand and/or recognize the SUMOylated proteome. Herein, we present a novel deep learning-based approach called SumoPred-PLM for human SUMOylation prediction, with sensitivity, specificity, Matthews correlation coefficient and accuracy of 74.64%, 73.36%, 0.48 and 74.00%, respectively, on the CPLM 4.0 independent test dataset. In addition, this platform uses contextualized embeddings obtained from a pre-trained protein language model, ProtT5-XL-UniRef50, to identify SUMO2/3-specific conjugation sites. The results demonstrate that SumoPred-PLM is a powerful and unique computational tool to predict SUMOylation sites in proteins and accelerate discovery.
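
The four metrics quoted in the abstract all derive from the same confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code. The counts are hypothetical: they assume a balanced test set of 10,000 positive and 10,000 negative sites (an assumption, not a figure from the paper) and are chosen so that sensitivity and specificity match the reported 74.64% and 73.36%.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is a correlation in [-1, 1], not a percentage.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Hypothetical counts for an assumed balanced set (10,000 positives, 10,000 negatives).
sens, spec, acc, mcc = binary_metrics(tp=7464, fn=2536, tn=7336, fp=2664)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f} mcc={mcc:.2f}")
```

Under the balance assumption, these counts give an accuracy of 0.74 and an MCC of about 0.48, which is consistent with the values reported in the abstract.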

© The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.

Figures

Figure 1.
The overall framework of SumoPred-PLM. Beads with letters represent protein sequences. The sky-colored rectangular box represents the ProtT5 PLM. Green rectangular boxes are the per-residue 1024-feature representations produced by the ProtT5 PLM. Empty circles represent neurons. Each neuron is connected to other nodes via links, like a biological axon-synapse-dendrite connection. A dropout of 0.3 means that 30% of neurons are switched off randomly while training the MLP.
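
The dropout behaviour described in this caption can be illustrated with a short sketch. It uses the common "inverted dropout" formulation, in which surviving activations are rescaled by 1/(1 - p) during training and nothing is dropped at inference; whether SumoPred-PLM uses exactly this variant is an assumption here, and the function name and activation values are hypothetical.

```python
import random

def dropout(activations, p=0.3, training=True, rng=random):
    """Inverted dropout: zero each activation with probability p, rescale the rest."""
    if not training or p == 0.0:
        # At inference time every neuron stays active and values pass through unchanged.
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

rng = random.Random(0)                    # fixed seed for reproducibility
hidden = [1.0] * 10000                    # hypothetical hidden-layer activations
dropped = dropout(hidden, p=0.3, training=True, rng=rng)
fraction_off = dropped.count(0.0) / len(dropped)
print(f"fraction switched off: {fraction_off:.3f}")
```

With many neurons, the fraction switched off in a single training pass is close to 0.3, and a fresh random subset is drawn on each pass.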
Figure 2.
Comparison of ROC curves of SumoPred-PLM and other models on the SUMOylation CPLM 4.0 independent test dataset. For each model, the area under the ROC curve is reported.
Figure 3.
Comparison of precision-recall curves of SumoPred-PLM and other models on the SUMOylation CPLM 4.0 independent test dataset. For each model, the area under the precision-recall curve (PrAUC) is reported.
Figure 4.
t-SNE illustration of the learned features from ProtT5 language model.
Figure 5.
t-SNE illustration of the learned features from the trained MLP model.
Figure 6.
ROC curve of SumoPred-PLM on the GPS-SUMO independent test dataset.
Figure 7.
SumoPred-PLM prediction results of human androgen receptor, where sites with a prediction score above 0.5 (shown by the red dotted line) are predicted as SUMOylated sites. Green bars represent the five SUMOylation sites with experimental evidence from protein microarray data.
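
The decision rule in this caption (a lysine is called SUMOylated when its score is above 0.5) can be sketched as follows. The function name, the toy sequence and the scores are all made up for illustration; they are not androgen receptor data and not output from SumoPred-PLM.

```python
def predicted_sumo_sites(sequence, scores, threshold=0.5):
    """Return 1-based positions of lysines whose score exceeds the threshold.

    `scores` maps 1-based residue positions to prediction scores in [0, 1].
    """
    sites = []
    for pos, residue in enumerate(sequence, start=1):
        # Only lysine (K) residues are candidate SUMOylation sites.
        if residue == "K" and scores.get(pos, 0.0) > threshold:
            sites.append(pos)
    return sites

# Toy sequence and hypothetical scores for its three lysines (positions 2, 6 and 9).
toy_sequence = "MKVLAKEEKGS"
toy_scores = {2: 0.81, 6: 0.50, 9: 0.12}
calls = predicted_sumo_sites(toy_sequence, toy_scores)
print(calls)
```

Because the comparison is strict, a score of exactly 0.5 is not called as a site in this sketch; only the lysine at position 2 passes.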

