
- ISSN 0929-9971
- E-ISSN: 1569-9994
HYPHEN
A flexible, hybrid method to map phenotype concept mentions to terminological resources
- Author(s): Paul Thompson and Sophia Ananiadou
- Affiliations: University of Manchester (both authors)
- Source: Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication, Volume 24, Issue 1, Jan 2018, pp. 91-121
- DOI: https://doi.org/10.1075/term.00015.tho
- Version of Record published: 31 May 2018
Abstract
Narrative clinical records and biomedical articles constitute rich sources of information about phenotypes, i.e., markers distinguishing individuals with specific medical conditions from the general population. Phenotypes help clinicians to provide personalised treatments. However, locating information about them within huge document repositories is difficult, since each phenotypic concept can be mentioned in many ways. Normalisation methods automatically map divergent phrases to unique concepts in domain-specific terminologies, so that all mentions of a concept of interest can be located and linked. We have developed a hybrid normalisation method (HYPHEN) to handle concept mentions with wide-ranging characteristics, across different text types. HYPHEN integrates various normalisation techniques that handle surface-level variations (e.g., differences in word order, word forms or acronyms/abbreviations) and lexical-level variations (where terms have similar meanings, but potentially unrelated forms). HYPHEN achieves robust performance for both biomedical academic text and narrative clinical records, and can significantly outperform related methods.
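
To make the distinction between surface-level and lexical-level variation concrete, the following is a minimal sketch of surface-level normalisation only. It is not HYPHEN's implementation: the toy terminology, the concept identifiers, the stopword list, and the function names `surface_key` and `normalise` are all assumptions invented for illustration, and the suffix stripping is a deliberately crude stand-in for proper morphological analysis.

```python
import re

# Hypothetical toy terminology: concept ID -> preferred term.
# The IDs are invented for this sketch and belong to no real resource.
TERMINOLOGY = {
    "C001": "shortness of breath",
    "C002": "enlarged heart",
}

# Small illustrative stopword list (an assumption, not taken from the article).
STOPWORDS = {"of", "the", "a", "an", "in"}


def surface_key(phrase):
    """Reduce a phrase to an order-insensitive set of crudely stemmed tokens."""
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    stems = []
    for token in tokens:
        if token in STOPWORDS:
            continue
        # Crude suffix stripping to absorb simple word-form differences.
        for suffix in ("ing", "ed", "es", "s"):
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        stems.append(token)
    # A frozenset ignores word order and can be used as a dictionary key.
    return frozenset(stems)


# Index the terminology by surface key for direct lookup.
INDEX = {surface_key(term): cid for cid, term in TERMINOLOGY.items()}


def normalise(mention):
    """Return the concept ID for a mention, or None if no surface match exists."""
    return INDEX.get(surface_key(mention))
```

In this sketch, `normalise("breath shortness")` and `normalise("Enlarged hearts")` resolve to their concepts despite differences in word order and word form, whereas `normalise("dyspnoea")` returns `None`: a synonym with an unrelated form is a lexical-level variation, which matching on surface forms alone cannot capture.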