Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text
- PMID: 19208194
- PMCID: PMC2646239
- DOI: 10.1186/1471-2105-10-S2-S6
Abstract
Background: Pharmacogenomics studies the relationship between genetic variation and variation in drug response phenotypes. The field is rapidly gaining importance because it promises drugs targeted to particular subpopulations based on genetic background. The pharmacogenomics literature has expanded rapidly but is dispersed across many journals. It is therefore challenging to identify important associations between drugs and molecular entities, particularly genes and gene variants, and these critical connections are often lost. Text mining techniques can convert free text into a computable, searchable format in which pharmacogenomic concepts (such as genes, drugs, polymorphisms, and diseases) are identified and important links between these concepts are recorded. Availability of full text articles as input to text mining engines is key, because abstracts often do not contain enough information to identify these pharmacogenomic associations.
Results: Building on a tool called Textpresso, we have created Pharmspresso to assist in identifying important pharmacogenomic facts in full text articles. Pharmspresso parses text to find references to human genes, polymorphisms, drugs, and diseases, and the relationships among them. It presents these as a series of marked-up text fragments in which key concepts are visually highlighted. To evaluate Pharmspresso, we used a gold standard of 45 human-curated articles. Pharmspresso identified 78%, 61%, and 74% of target gene, polymorphism, and drug concepts, respectively.
Conclusion: Pharmspresso is a text analysis tool that extracts pharmacogenomic concepts from the literature automatically and thus captures our current understanding of gene-drug interactions in a computable form. We have made Pharmspresso available at http://pharmspresso.stanford.edu.
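The Results paragraph describes two steps that a short sketch can make concrete: marking up concept mentions in a sentence, and scoring the concepts found against a human-curated gold standard. The Python below is only an illustration under stated assumptions, not Pharmspresso's implementation. The lexicon entries, the polymorphism pattern, the bracket markup, and all function names are hypothetical, and the abstract does not say how the reported percentages were computed (recall is assumed here).

```python
import re

# Hypothetical toy lexicons; the vocabularies used by the real tool
# are not given in the abstract.
LEXICON = {
    "gene": ["CYP2D6", "VKORC1", "TPMT"],
    "drug": ["warfarin", "codeine", "azathioprine"],
}

# Assumed pattern for polymorphism mentions: rs identifiers and star alleles.
POLYMORPHISM_RE = re.compile(r"\brs\d+\b|\*\d+\b")


def build_patterns(lexicon):
    """Compile one case-insensitive regex per concept category."""
    patterns = {}
    for category, terms in lexicon.items():
        # Longest terms first so longer names win over their prefixes.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        patterns[category] = re.compile(r"\b(?:%s)\b" % alternation, re.IGNORECASE)
    patterns["polymorphism"] = POLYMORPHISM_RE
    return patterns


def find_concepts(sentence, patterns):
    """Return (start, end, category, text) tuples sorted by position."""
    hits = []
    for category, pattern in patterns.items():
        for match in pattern.finditer(sentence):
            hits.append((match.start(), match.end(), category, match.group(0)))
    return sorted(hits)


def mark_up(sentence, hits):
    """Wrap each hit as [category:text]; overlapping hits are skipped."""
    pieces = []
    position = 0
    for start, end, category, text in hits:
        if start < position:
            continue
        pieces.append(sentence[position:start])
        pieces.append("[%s:%s]" % (category, text))
        position = end
    pieces.append(sentence[position:])
    return "".join(pieces)


def recall(found, gold):
    """Fraction of gold-standard concepts that were also found."""
    if not gold:
        return 0.0
    return len(found & gold) / len(gold)


if __name__ == "__main__":
    patterns = build_patterns(LEXICON)
    sentence = "Carriers of CYP2D6*4 showed reduced response to codeine."
    hits = find_concepts(sentence, patterns)
    print(mark_up(sentence, hits))
    # Carriers of [gene:CYP2D6][polymorphism:*4] showed reduced response to [drug:codeine].
```

Computing `recall` separately for each concept category, over sets of normalized concept names per article, would yield one figure each for genes, polymorphisms, and drugs, which is the shape of the evaluation the abstract reports.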
Similar articles
- Textpresso: an ontology-based information retrieval and extraction system for biological literature. Müller HM, Kenny EE, Sternberg PW. PLoS Biol. 2004 Nov;2(11):e309. doi: 10.1371/journal.pbio.0020309. Epub 2004 Sep 21. PMID: 15383839.
- Recent progress in automatically extracting information from the pharmacogenomic literature. Garten Y, Coulet A, Altman RB. Pharmacogenomics. 2010 Oct;11(10):1467-89. doi: 10.2217/pgs.10.136. PMID: 21047206. Review.
- Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies. Pakhomov S, McInnes BT, Lamba J, Liu Y, Melton GB, Ghodke Y, Bhise N, Lamba V, Birnbaum AK. J Biomed Inform. 2012 Oct;45(5):862-9. doi: 10.1016/j.jbi.2012.04.007. Epub 2012 May 4. PMID: 22564551.
- A mutation-centric approach to identifying pharmacogenomic relations in text. Rance B, Doughty E, Demner-Fushman D, Kann MG, Bodenreider O. J Biomed Inform. 2012 Oct;45(5):835-41. doi: 10.1016/j.jbi.2012.05.003. Epub 2012 Jun 7. PMID: 22683993.
- Progress towards the integration of pharmacogenomics in practice. Mooney SD. Hum Genet. 2015 May;134(5):459-65. doi: 10.1007/s00439-014-1484-7. Epub 2014 Sep 11. PMID: 25238897. Review.
Cited by
- Automated Metabolic Phenotyping of Cytochrome Polymorphisms Using PubMed Abstract Mining. Chen L, Friedman C, Finkelstein J. AMIA Annu Symp Proc. 2018 Apr 16;2017:535-544. eCollection 2017. PMID: 29854118.
- Recurrent neural networks for classifying relations in clinical notes. Luo Y. J Biomed Inform. 2017 Aug;72:85-95. doi: 10.1016/j.jbi.2017.07.006. Epub 2017 Jul 8. PMID: 28694119.
- Mining the pharmacogenomics literature--a survey of the state of the art. Hahn U, Cohen KB, Garten Y, Shah NH. Brief Bioinform. 2012 Jul;13(4):460-94. doi: 10.1093/bib/bbs018. PMID: 22833496.
- Discovering drug-drug interactions: a text-mining and reasoning approach based on properties of drug metabolism. Tari L, Anwar S, Liang S, Cai J, Baral C. Bioinformatics. 2010 Sep 15;26(18):i547-53. doi: 10.1093/bioinformatics/btq382. PMID: 20823320.
- Towards precision medicine: advances in computational approaches for the analysis of human variants. Peterson TA, Doughty E, Kann MG. J Mol Biol. 2013 Nov 1;425(21):4047-63. doi: 10.1016/j.jmb.2013.08.008. Epub 2013 Aug 17. PMID: 23962656. Review.