Movatterモバイル変換


[0]ホーム

URL:


Skip to main content

Advertisement

Springer Nature Link
Log in

Weakly-Supervised Symptom Recognition for Rare Diseases in Biomedical Text

  • Conference paper
  • First Online:

Abstract

In this paper, we tackle the issue of symptom recognition for rare diseases in biomedical texts. Symptoms typically have more complex and ambiguous structure than other biomedical named entities. Furthermore, existing resources are scarce and incomplete. Therefore, we propose a weakly-supervised framework based on a combination of two approaches: sequential pattern mining under constraints and sequence labeling. We use unannotated biomedical paper abstracts with dictionaries of rare diseases and symptoms to create our training data. Our experiments show that both approaches outperform simple projection of the dictionaries on text, and their combination is beneficial. We also introduce a novel pattern mining constraint based on semantic similarity between words inside patterns.

This is a preview of subscription content,log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide -see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Similar content being viewed by others

References

  1. Agrawal, R., Srikant, R.: Mining sequential patterns. In: Proceedings of the Eleventh International Conference on Data Engineering, pp. 3–14 (1995)

    Google Scholar 

  2. Béchet, N., Cellier, P., Charnois, T., Crémilleux, B.: Sequence mining under multiple constraints. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing, pp. 908–914 (2015)

    Google Scholar 

  3. Cohen, K.B.: BioNLP: biomedical text mining. In: Handbook of Natural Language Processing, 2nd edn. (2010)

    Google Scholar 

  4. Doğan, R.I., Leaman, R., Lu, Z.: NCBI disease corpus: a resource for disease name recognition and concept normalization. J. Biomed. Inf.47, 1–10 (2014)

    Article  Google Scholar 

  5. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by gibbs sampling. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 363–370 (2005)

    Google Scholar 

  6. Harris, Z.S.: Distributional structure. Word10(2–3), 146–162 (1954)

    Article  Google Scholar 

  7. Kokkinakis, D.: Developing resources for swedish bio-medical text mining. In: Proceedings of the 2nd International Symposium on Semantic Mining in Biomedicine (SMBM) (2006)

    Google Scholar 

  8. Lafferty, J., McCallum, A., Pereira, F.C.: Conditional random fields: probabilistic models for segmenting and labeling sequence data (2001)

    Google Scholar 

  9. Leaman, R., Miller, C., Gonzalez, G.: Enabling recognition of diseases in biomedical text with machine learning: corpus and benchmark. In: Proceedings of the 2009 Symposium on Languages in Biology and Medicine, vol. 82(9) (2009)

    Google Scholar 

  10. Martin, L., Battistelli, D., Charnois, T.: Symptom extraction issue. In: Proceedings of BioNLP 2014, pp. 107–111 (2014)

    Google Scholar 

  11. Métivier, J.P., Serrano, L., Charnois, T., Cuissart, B., Widlöcher, A.: Automatic symptom extraction from texts to enhance knowledge discovery on rare diseases. In: Holmes, J.H., Bellazzi, R., Sacchi, L., Peek, N. (eds.) Artificial Intelligence in Medicine. LNCS, vol. 9105, pp. 249–254. Springer, Heidelberg (2015). doi:10.1007/978-3-319-19551-3_33

    Chapter  Google Scholar 

  12. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprintarXiv:1301.3781 (2013)

  13. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

    Google Scholar 

  14. Pei, J., Han, J., Wang, W.: Constraint-based sequential pattern mining: the pattern-growth methods. J. Intell. Inf. Syst.28(2), 133–160 (2007)

    Article  Google Scholar 

  15. Savova, G.K., Masanz, J.J., Ogren, P.V., Zheng, J., Sohn, S., Kipper-Schuler, K.C., Chute, C.G.: Mayo clinical text analysis and knowledge extraction system (ctakes): architecture, component evaluation and applications. J. Am. Med. Inf. Assoc.17(5), 507–513 (2010)

    Article  Google Scholar 

  16. South, B.R., Shen, S., Jones, M., Garvin, J., Samore, M.H., Chapman, W.W., Gundlapalli, A.V.: Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease. BMC Bioinform.10(9), 1 (2009)

    Google Scholar 

  17. Srikant, R., Agrawal, R.: Mining sequential patterns: generalizations and performance improvements. In: Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology, pp. 3–17 (1996)

    Google Scholar 

  18. Uzuner, Ö., South, B.R., Shen, S., DuVall, S.L.: 2010 i2b2/va challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inf. Assoc.18(5), 552–556 (2011)

    Article  Google Scholar 

  19. Wagholikar, K.B., Torii, M., Jonnalagadda, S.R., Liu, H.: Pooling annotated corpora for clinical concept extraction. J. Biomed. Semant.4(1), 1–10 (2013)

    Article  Google Scholar 

Download references

Acknowledgments

This work is supported by the French National Research Agency (ANR) as part of the project Hybride ANR-11-BS02-002 and the “Investissements d’Avenir” program (reference: ANR-10-LABX-0083).

Author information

Authors and Affiliations

  1. LIPN, University of Paris 13, Sorbonne Paris Cité, Paris, France

    Pierre Holat, Nadi Tomeh & Thierry Charnois

  2. MoDyCo, University of Paris Ouest Nanterre La Défense, Paris, France

    Delphine Battistelli

  3. Inserm, Paris, France

    Marie-Christine Jaulent

  4. GREYC, University of Caen Basse-Normandie, Caen, France

    Jean-Philippe Métivier

Authors
  1. Pierre Holat

    You can also search for this author inPubMed Google Scholar

  2. Nadi Tomeh

    You can also search for this author inPubMed Google Scholar

  3. Thierry Charnois

    You can also search for this author inPubMed Google Scholar

  4. Delphine Battistelli

    You can also search for this author inPubMed Google Scholar

  5. Marie-Christine Jaulent

    You can also search for this author inPubMed Google Scholar

  6. Jean-Philippe Métivier

    You can also search for this author inPubMed Google Scholar

Corresponding author

Correspondence toPierre Holat.

Editor information

Editors and Affiliations

  1. Stockholm University , Stockholm, Sweden

    Henrik Boström

  2. Leiden University , Leiden, The Netherlands

    Arno Knobbe

  3. University of Porto , Porto, Portugal

    Carlos Soares

  4. Stockholm University , Stockholm, Sweden

    Panagiotis Papapetrou

Rights and permissions

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Holat, P., Tomeh, N., Charnois, T., Battistelli, D., Jaulent, MC., Métivier, JP. (2016). Weakly-Supervised Symptom Recognition for Rare Diseases in Biomedical Text. In: Boström, H., Knobbe, A., Soares, C., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XV. IDA 2016. Lecture Notes in Computer Science(), vol 9897. Springer, Cham. https://doi.org/10.1007/978-3-319-46349-0_17

Download citation

Publish with us

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide -see info

Tax calculation will be finalised at checkout

Purchases are for personal use only


[8]ページ先頭

©2009-2025 Movatter.jp