- Marco Notaro (ORCID: orcid.org/0000-0003-4309-2200),
- Max Schubach (ORCID: orcid.org/0000-0002-2032-6679),
- Marco Frasca (ORCID: orcid.org/0000-0002-4170-0922),
- Marco Mesiti (ORCID: orcid.org/0000-0001-5701-0080),
- Peter N. Robinson (ORCID: orcid.org/0000-0002-0736-9199) &
- Giorgio Valentini (ORCID: orcid.org/0000-0002-5694-3919)
Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10834)
Abstract
The Human Phenotype Ontology (HPO) provides a standard categorization of the phenotypic abnormalities encountered in human diseases and of the semantic relationships between them. Surprisingly, the automated prediction of associations between genes and abnormal human phenotypes has been widely overlooked, even though it represents an important step toward the characterization of gene-disease associations, especially when little or no knowledge is available about the genetic etiology of the disease under study. We present a novel ensemble method that captures the hierarchical relationships between HPO terms and improves existing hierarchical ensemble algorithms by explicitly considering the predictions for the descendant terms of the ontology. In this way the algorithm exploits the information embedded in the most specific ontology terms, which most closely characterize the phenotypic information associated with each human gene. Genome-wide results obtained by integrating multiple sources of information show the effectiveness of the proposed approach.
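The abstract does not spell out the algorithm, so the following is only a toy illustration of the general idea of letting descendant-term predictions inform an ancestor term in a DAG-structured ontology. It is not the authors' method. The term names, the scores, the averaging rule in `bottom_up`, and the capping rule in `top_down` are all assumptions chosen for illustration.

```python
# Toy ontology (hypothetical term names): each term maps to its direct children.
CHILDREN = {
    "root": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
    "C": [],
    "D": [],
}

# Hypothetical flat classifier scores for one gene, one score per term.
FLAT = {"root": 0.5, "A": 0.2, "B": 0.4, "C": 0.8, "D": 0.1}


def descendants(term, children):
    """Return the set of all descendants of `term` in the DAG."""
    seen = set()
    stack = list(children[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen


def bottom_up(flat, children):
    """Assumed rule: average a term's score with every descendant score above it."""
    out = {}
    for term, score in flat.items():
        support = [flat[d] for d in descendants(term, children) if flat[d] > score]
        out[term] = (score + sum(support)) / (1 + len(support))
    return out


def top_down(scores, children):
    """Assumed rule: cap each term's score at the minimum score of its parents."""
    parents = {t: [] for t in children}
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].append(parent)

    out = {}

    def resolve(term):
        # Parents are resolved first, so the cap uses already-corrected scores.
        if term not in out:
            cap = min((resolve(p) for p in parents[term]), default=1.0)
            out[term] = min(scores[term], cap)
        return out[term]

    for term in children:
        resolve(term)
    return out


corrected = top_down(bottom_up(FLAT, CHILDREN), CHILDREN)
```

In this sketch the confident prediction for the specific term `C` raises the scores of its ancestors `A`, `B`, and `root`, while the top-down pass keeps every term's score no higher than that of any of its parents, so the final scores respect the hierarchy.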
Acknowledgments
We acknowledge partial support from the project “Discovering Patterns in Multi-Dimensional Data” (2016–2017) funded by Università degli Studi di Milano.
Author information
Authors and Affiliations
Anacleto Lab – Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20135, Milano, Italy
Marco Notaro, Marco Frasca, Marco Mesiti & Giorgio Valentini
Berlin Institute of Health (BIH), Anna-Louisa-Karsch-Str. 2, 10178, Berlin, Germany
Max Schubach
The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr, Farmington, CT, 06032, USA
Peter N. Robinson
Corresponding author
Correspondence to Giorgio Valentini.
Editor information
Editors and Affiliations
University of Cagliari, Cagliari, Italy
Massimo Bartoletti
University of Genova, Genoa, Italy
Annalisa Barla
University of Stirling, Stirling, UK
Andrea Bracciali
Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany
Gunnar W. Klau
Houston Methodist Research Institute, Houston, TX, USA
Leif Peterson
University of Udine, Udine, Italy
Alberto Policriti
University of Salerno, Fisciano, Italy
Roberto Tagliaferri
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Notaro, M., Schubach, M., Frasca, M., Mesiti, M., Robinson, P.N., Valentini, G. (2019). Ensembling Descendant Term Classifiers to Improve Gene - Abnormal Phenotype Predictions. In: Bartoletti, M., et al. Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2017. Lecture Notes in Computer Science, vol 10834. Springer, Cham. https://doi.org/10.1007/978-3-030-14160-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14159-2
Online ISBN: 978-3-030-14160-8
eBook Packages: Computer Science, Computer Science (R0)