Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information
- PMID:16895930
- DOI: 10.1093/bioinformatics/btl423
Abstract
Motivation: Single nucleotide polymorphisms (SNPs) are the most frequent type of genetic variation in the human population. One of the most important goals of SNP projects is to understand which human genotype variations are related to Mendelian and complex diseases. Great interest is focused on non-synonymous coding SNPs (nsSNPs), which are responsible for single point mutations in proteins. nsSNPs can be neutral or disease associated. It is known that the mutation of only one residue in a protein sequence can be related to a number of pathological conditions of dramatic social impact, such as Alzheimer's, Parkinson's and Creutzfeldt-Jakob diseases. The quality and completeness of presently available SNP databases allow the application of machine learning techniques to predict the insurgence of human diseases due to single point protein mutations, starting from the protein sequence.
Results: In this paper, we develop a method based on support vector machines (SVMs) that, starting from protein sequence information, can predict whether a new phenotype derived from an nsSNP can be related to a genetic disease in humans. Using a dataset of 21 185 single point mutations (61% of them disease related) from 3587 proteins, we show that our predictor can reach more than 74% accuracy in the specific task of predicting whether a single point mutation is disease related or not. Our method, although based on less information, outperforms other web-available predictors implementing different approaches.
Availability: A beta version of the web tool is available at http://gpcr.biocomp.unibo.it/cgi/predictors/PhD-SNP/PhD-SNP.cgi
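The abstract does not say how a mutation is turned into SVM input, so the following is only a minimal sketch under stated assumptions, not the authors' encoding. It assumes a hypothetical 40-element feature vector: 20 values marking the wild-type residue (-1) and the mutant residue (+1), plus 20 values counting the residues in a sequence window around the mutated position. The names `encode_mutation`, `majority_baseline_accuracy` and the `window` parameter are illustrative. The second function puts the reported figures in context: with 61% of mutations disease related, always predicting "disease" already scores 61%, so the reported accuracy of more than 74% should be read against that baseline.

```python
# Hypothetical sequence-only encoding of a single point mutation.
# This is an illustrative assumption, not the encoding used in the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_mutation(sequence, position, mutant, window=9):
    """Return a 40-element feature vector for one point mutation.

    sequence: protein sequence in one-letter code
    position: 1-based index of the mutated residue
    mutant:   one-letter code of the substituted residue
    window:   odd number of residues centred on the mutated position
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    wild_type = sequence[position - 1]
    if wild_type not in AA_INDEX or mutant not in AA_INDEX:
        raise ValueError("non-standard residue")
    if wild_type == mutant:
        raise ValueError("mutant equals wild-type residue")

    # Part 1: -1 at the wild-type residue, +1 at the mutant residue.
    mutation_part = [0.0] * len(AMINO_ACIDS)
    mutation_part[AA_INDEX[wild_type]] = -1.0
    mutation_part[AA_INDEX[mutant]] = 1.0

    # Part 2: residue counts in the window, truncated at sequence ends.
    half = window // 2
    start = max(0, position - 1 - half)
    end = min(len(sequence), position + half)
    environment_part = [0.0] * len(AMINO_ACIDS)
    for residue in sequence[start:end]:
        if residue in AA_INDEX:
            environment_part[AA_INDEX[residue]] += 1.0

    return mutation_part + environment_part


def majority_baseline_accuracy(disease_fraction):
    """Accuracy of always predicting the more frequent class."""
    return max(disease_fraction, 1.0 - disease_fraction)


if __name__ == "__main__":
    features = encode_mutation("MKTAYIAKQR", position=4, mutant="V", window=5)
    print(len(features))
    print(majority_baseline_accuracy(0.61))
```

Vectors of this kind, paired with disease-related or neutral labels, could then be passed to any SVM implementation for training. The choice of kernel and window size would need to be tuned by cross-validation.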