'HypothesisFinder:' a strategy for the detection of speculative statements in scientific text
- PMID: 23935466
- PMCID: PMC3723489
- DOI: 10.1371/journal.pcbi.1003117
Abstract
Speculative statements communicating experimental findings are frequently found in scientific articles, and their purpose is to provide an impetus for further investigations into the given topic. Automated recognition of speculative statements in scientific text has gained interest in recent years, as systematic analysis of such statements could transform speculative thoughts into testable hypotheses. We describe here a pattern matching approach for the detection of speculative statements in scientific text that uses a dictionary of speculative patterns to classify sentences as hypothetical. To demonstrate the practical utility of our approach, we applied it to the domain of Alzheimer's disease and showed that it captures a wide spectrum of scientific speculations on Alzheimer's disease. Subsequent exploration of the derived hypothetical knowledge yields a coherent overview of emerging knowledge niches, and can thus provide added value to ongoing research activities.
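The dictionary-based pattern matching described in the abstract can be illustrated with a minimal sketch. The pattern list, the function names, and the example sentences below are all assumptions made for illustration; they are not the published HypothesisFinder dictionary or implementation, and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical pattern dictionary, for illustration only.
# This is NOT the dictionary used by HypothesisFinder.
SPECULATIVE_PATTERNS = [
    r"\bmay\b",
    r"\bmight\b",
    r"\bcould\b",
    r"\bwe\s+hypothesi[sz]e\b",
    r"\bsuggest(?:s|ed|ing)?\s+that\b",
    r"\b(?:is|are)\s+thought\s+to\b",
    r"\bit\s+is\s+(?:possible|likely)\s+that\b",
]

# Compile once so repeated sentence checks are cheap.
_COMPILED = [re.compile(p, re.IGNORECASE) for p in SPECULATIVE_PATTERNS]


def matched_patterns(sentence):
    """Return the source strings of all dictionary patterns found in the sentence."""
    return [p.pattern for p in _COMPILED if p.search(sentence)]


def is_speculative(sentence):
    """Classify a sentence as hypothetical if any dictionary pattern matches."""
    return bool(matched_patterns(sentence))


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_speculative(text):
    """Return the sentences in text that match at least one speculative pattern."""
    return [s for s in split_sentences(text) if is_speculative(s)]


# Invented example sentences, used only to exercise the sketch.
example_text = (
    "APP is cleaved by beta-secretase. "
    "Elevated tau may contribute to neuronal loss. "
    "These data suggest that inflammation precedes plaque formation."
)

for sentence in find_speculative(example_text):
    print(sentence, "<-", matched_patterns(sentence))
```

A real dictionary would need to be curated and evaluated against annotated text, since single cue words such as "may" or "could" can also occur in non-speculative contexts.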
Conflict of interest statement
The authors have declared that no competing interests exist.