doi: 10.7717/peerj.11117. eCollection 2021.

Multi-schema computational prediction of the comprehensive SARS-CoV-2 vs. human interactome

Kevin Dick et al. PeerJ.

Abstract

Background: Understanding the disease pathogenesis of the novel coronavirus, denoted SARS-CoV-2, is critical to the development of anti-SARS-CoV-2 therapeutics. The global propagation of the viral disease, denoted COVID-19 ("coronavirus disease 2019"), has unified the scientific community in searching for possible inhibitory small molecules or polypeptides. A holistic understanding of the SARS-CoV-2 vs. human inter-species interactome promises to identify putative protein-protein interactions (PPI) that may be considered targets for the development of inhibitory therapeutics.

Methods: We leverage two state-of-the-art, sequence-based PPI predictors (PIPE4 & SPRINT) capable of generating the comprehensive SARS-CoV-2 vs. human interactome, comprising approximately 285,000 pairwise predictions. Three prediction schemas (all, proximal, RP-PPI) are leveraged to obtain our highest-confidence subset of PPIs and human proteins predicted to interact with each of the 14 SARS-CoV-2 proteins considered in this study. Notably, the use of the Reciprocal Perspective (RP) framework demonstrates improved predictive performance in multiple cross-validation experiments.

Results: The all schema identified 279 high-confidence putative interactions involving 225 human proteins, the proximal schema identified 129 high-confidence putative interactions involving 126 human proteins, and the RP-PPI schema identified 539 high-confidence putative interactions involving 494 human proteins. The intersection of the three sets of predictions comprises the seven highest-confidence PPIs. Notably, the Spike-ACE2 interaction was the highest ranked for both the PIPE4 and SPRINT predictors with the all and proximal schemas, corroborating existing evidence for this PPI. Several other predicted PPIs are biologically relevant within the context of the original SARS-CoV virus. Furthermore, the PIPE-Sites algorithm was used to identify the putative subsequence that might mediate each interaction and thereby inform the design of inhibitory polypeptides intended to disrupt the corresponding host-pathogen interactions.
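
The intersection step described in the Results can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `highest_confidence` is hypothetical, and the pair identifiers (`V1`, `H1`, and so on) are placeholders rather than published predictions.

```python
from functools import reduce


def highest_confidence(*schema_predictions):
    """Return the (viral, human) pairs that every schema predicts."""
    sets = [set(pairs) for pairs in schema_predictions]
    if not sets:
        return set()
    # Fold pairwise intersection over all schema sets.
    return reduce(set.intersection, sets)


# Placeholder pairs for illustration only (NOT the published predictions).
all_schema = {("V1", "H1"), ("V2", "H2"), ("V5", "H5")}
proximal_schema = {("V1", "H1"), ("V3", "H3"), ("V5", "H5")}
rp_ppi_schema = {("V1", "H1"), ("V4", "H4")}

consensus = highest_confidence(all_schema, proximal_schema, rp_ppi_schema)
print(sorted(consensus))
```

A pair survives only if all three schemas report it, which is why the consensus set is much smaller than any single schema's set.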

Conclusion: We publicly released the comprehensive sets of PPI predictions and their corresponding PIPE-Sites landscapes in the following DataVerse repository: https://www.doi.org/10.5683/SP2/JZ77XA. The information provided represents theoretical modeling only and caution should be exercised in its use. It is intended as a resource for the scientific community at large in furthering our understanding of SARS-CoV-2.

Keywords: Comprehensive interactome; Inter-species interaction prediction; Machine learning; Protein–protein interaction; SARS-CoV-2.

©2021 Dick et al.


Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1. Overview of the three prediction strategies to generate the SARS-CoV-2 vs. human interactome.
The three schemas depict how known PPIs are leveraged to train a prediction model to generate predictions for SARS-CoV-2.
Figure 2. Example compilation of the spike protein one-to-all score curve, knee detection for local cut-off, and rank order predictions, for each method.
(A & D) depict the one-to-all score curves from the predicted score between the Spike protein and all proteins in the human proteome. (B & E) depict the detected knee from the top-1000 scores in each one-to-all score curve where the dashed line represents the knee detected using the Kneedle algorithm. (C & F) depict the predicted interactions above the knee.
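
To make the local cut-off idea concrete, the sketch below finds a knee in a descending one-to-all score curve. It is a simplified stand-in, not the Kneedle algorithm itself: it normalises rank and score to [0, 1] and picks the point that lies furthest below the straight line joining the first and last points. The function name `find_knee`, the `top_n` default, and the example scores are assumptions made for illustration.

```python
def find_knee(scores, top_n=1000):
    """Return the index of the knee in a descending score curve.

    Simplified heuristic (not Kneedle proper): the knee is the point with
    the largest gap below the chord joining the first and last points of
    the normalised curve.
    """
    ys = sorted(scores, reverse=True)[:top_n]
    n = len(ys)
    if n < 3 or ys[0] == ys[-1]:
        return 0  # too short or flat: no knee to detect
    span = ys[0] - ys[-1]
    best_idx, best_gap = 0, float("-inf")
    for i, y in enumerate(ys):
        x_n = i / (n - 1)         # normalised rank
        y_n = (y - ys[-1]) / span  # normalised score
        gap = (1.0 - x_n) - y_n    # distance below the chord y = 1 - x
        if gap > best_gap:
            best_idx, best_gap = i, gap
    return best_idx


# Made-up scores: three clear outliers followed by a flat tail.
example_scores = [100, 90, 80, 10, 9, 8, 7, 6, 5, 4]
knee = find_knee(example_scores)
above_knee = sorted(example_scores, reverse=True)[:knee]
print(knee, above_knee)
```

In this sketch, the entries ranked before the knee index play the role of the predicted interactions above the knee.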
Figure 3. Venn diagram of the human proteins predicted to interact with SARS-CoV-2 proteins.
(A), (B), and (C) depict the number of predicted pairs for each schema’s putative interactome. In (D), those interactomes are combined further by taking their intersections, with the highest-confidence subset comprising n = 7 pairs.
Figure 4. Network visualization of the highest-confidence predictions.
The colour-function relationship is depicted in Fig. 11. Created using Cytoscape (Shannon et al., 2003).
Figure 5. Life cycle of CoVs and the mode of future peptide inhibitors (PI) derived from this study.
(I) SARS-CoV-2 attaches to the cell surface via interaction of the spike (S) protein with the host ACE2 receptor. (II) The CoV and host membranes coalesce either at the cell surface or within endosomes, releasing the CoV genome into the cytoplasm. (III) Host ribosomes use the CoV genome as a template and translate polyproteins 1a and 1ab. (IV) The polyproteins mature into individual non-structural proteins (NSPs) 1-16 via autoproteolytic processing. (V) Multiple NSPs form a viral replicase complex, which performs negative-strand synthesis of genomic and subgenomic RNA negative-strand templates. (VI) The viral replicase synthesizes nascent plus-strands of the full-length CoV genome and subgenomic RNAs encoding structural (S, E, M, N) and accessory proteins (not shown). (VII) S, Membrane (M), and Envelope (E) proteins are translated at the endoplasmic reticulum (ER) and inserted into the ER membrane. The Nucleocapsid (N) protein is translated within the cytoplasm. (VIII) The N protein encapsulates the nascent CoV genome and interacts with the other structural proteins within the ER-Golgi intermediate compartment (ERGIC). (IX) Mature CoV particles are formed within vesicles upon budding into the lumen of the ERGIC. (X) CoV particles are released upon exocytosis. Besides the validated S-ACE2 interaction, other notable predicted protein-protein interactions are indicated by dashed arrows. This figure was made in ©BioRender - biorender.com.
Figure 6. The PIPE-sites landscape between the SARS-CoV-2 Spike protein and human ACE2 protein.
Within each panel, the three red rectangles represent the predicted PIPE-Sites regions. (A) depicts the completely predicted landscape with the complete numerical range of scores depicted. To more easily visualize the high-scoring subsequence regions, (B & C) apply a numerical “capped threshold” where any value greater than or equal to the maximum threshold is set to that value. This has the effect of emphasizing the regions of potential interest. A threshold of 3.0 is applied in B and a threshold of 1.0 in C. See the Supplemental Information for guidance on the interpretation of these landscapes.
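
The “capped threshold” described in this caption amounts to clipping every landscape value at a maximum. A minimal sketch follows, assuming the landscape is held as a list of rows; the function name `cap_landscape` and the example matrix are hypothetical and not taken from the PIPE-Sites implementation.

```python
def cap_landscape(matrix, threshold):
    """Set every value greater than or equal to `threshold` to `threshold`."""
    return [[min(value, threshold) for value in row] for row in matrix]


# Made-up landscape values, for illustration only.
landscape = [
    [0.2, 5.0, 0.0],
    [3.0, 1.5, 12.0],
]

capped_at_3 = cap_landscape(landscape, 3.0)  # cap used in panel B
capped_at_1 = cap_landscape(landscape, 1.0)  # cap used in panel C
print(capped_at_3)
print(capped_at_1)
```

Lowering the cap compresses the colour scale, so moderately scoring regions become as visible as the strongest ones.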
Figure 7. Dot plot of the BLASTp alignment of the SARS-CoV and SARS-CoV-2 Spike protein.
The alignment of the two proteins results in a max score of 2039, a total score of 2039, 100% coverage, an E-value of 0.0, and 76.04% identity. Specifically: 971/1277 (76%) identities, 1109/1277 (86%) positives, and 26/1277 (2%) gaps. Arrows indicate gaps within the alignment and the zoomed-in region highlights the six mismatches around residue 420.
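
The percentages in this caption can be recomputed from the reported counts. The short check below uses only the numbers given in the caption; the variable names are illustrative. Note that 1109/1277 is roughly 86.84%, so the whole-number figures appear to be truncated rather than rounded, which is an inference from the arithmetic, not something the caption states.

```python
aligned_length = 1277
counts = {"identities": 971, "positives": 1109, "gaps": 26}

# Percentage of aligned positions in each category.
percent = {name: 100 * count / aligned_length for name, count in counts.items()}

for name, value in percent.items():
    # Show the two-decimal value next to the truncated whole-number value.
    print(f"{name}: {value:.2f}% (truncated: {int(value)}%)")
```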
Figure 8. Landscapes of the six predicted HLA interactors with SARS-CoV-2 Protein 3a.
The three red rectangles represent the predicted PIPE-Sites regions. They’re “shifted” relative to the highlighted cells due to the algorithm’s use of a window of 20 amino acids in length that extends both to the left (along the x-axis) and upwards (along the y-axis). This implementation may also result in the predicted site extending past the coloured matrix, either to the right or above. The PIPE-Sites may overlap when numerous hits appear within close proximity, as is the case when a “band” of hits appears in the matrix. See the supplementary material for guidance on the interpretation of these landscapes.
Figure 9. Hit and SW landscapes for the four predicted hnRNPs to guide the design of peptide inhibitors.
Both “hotspots” and “bands” identify subsequence regions of interest to target with peptide inhibitors. See the Supplemental Information for guidance on the interpretation of these landscapes.
Figure 10. Network visualization of the 14 predicted pairs involving one of the 332 human proteins from Gordon et al. (2020).
Created using Cytoscape (Shannon et al., 2003).
Figure 11. Network visualization of the complete predicted interactomes for each schema.
(A) All schema. (B) Proximal schema. (C) RP-PPI Schema.