Multi-task Biomedical Overlapping and Nested Information Extraction Model Based on Unified Framework

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14303)

  • 1438 Accesses

Abstract

Biomedical information extraction automatically mines key knowledge from the biomedical literature and plays an important role in constructing knowledge graphs and databases. Biomedical named entity recognition and event detection are two of its central tasks. Previous research has mainly built models for single tasks, which generalize poorly across tasks. Moreover, most prior work neglects the detection of overlapping events and the recognition of nested entities, and the models that do address these problems suffer from low efficiency and poor performance. We therefore propose MBONIEUF, a multi-task model built on a unified framework that handles both biomedical named entity recognition and event detection. The model converts sequence labeling into machine reading comprehension: it first uses ChatGPT to generate semantically rich questions from the labels of the biomedical corpus, then concatenates each generated question with the original sentence and encodes the result with PubMedBERT. A feature extraction layer composed of BiGRU, Biaffine attention, and IDCNN modules captures the complex interactions between token pairs and the features between events and entities. In the decoding layer, the model learns a stacked classification pointer matrix and a multi-head nested query matrix, and a corresponding detection algorithm decodes the entities and events from the two matrices. Compared with traditional methods, the model improves task generalization and performs better on the overlapping and nested problems in biomedical information extraction. On the MLEE, BioNLP’09, and GENIA datasets, it achieves F1 scores of 81.62%, 75.76%, and 77.75%, respectively.
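As a rough illustration of the reading-comprehension formulation described in the abstract, the sketch below pairs a label question with a sentence and then reads entity spans out of a per-label token-pair score matrix. It is a simplified stand-in, not the authors' stacked pointer matrix or detection algorithm: the question wording, the threshold rule, the function names (`build_mrc_input`, `decode_spans`, `extract`), and the scores are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical questions; the paper generates its questions with ChatGPT.
QUESTIONS: Dict[str, str] = {
    "protein": "Which spans name a protein ?",
    "DNA": "Which spans name a DNA region or gene ?",
}


def build_mrc_input(question: str, tokens: List[str]) -> List[str]:
    """Concatenate a question and a sentence in BERT-style order."""
    return ["[CLS]", *question.split(), "[SEP]", *tokens, "[SEP]"]


def decode_spans(scores: List[List[float]], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Keep every (start, end) cell with start <= end and score > threshold.

    Cells are judged independently, so nested or overlapping spans can coexist.
    """
    n = len(scores)
    return [(i, j) for i in range(n) for j in range(i, n) if scores[i][j] > threshold]


def extract(tokens: List[str],
            label_scores: Dict[str, List[List[float]]],
            threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Decode one score matrix per label into (label, text) pairs."""
    found = []
    for label, scores in label_scores.items():
        for start, end in decode_spans(scores, threshold):
            found.append((label, " ".join(tokens[start:end + 1])))
    return found


tokens = ["human", "IL-2", "gene", "expression"]
n = len(tokens)

# Made-up scores standing in for the model's output (assumption).
dna = [[0.0] * n for _ in range(n)]
dna[0][2] = 0.9      # "human IL-2 gene"
protein = [[0.0] * n for _ in range(n)]
protein[1][1] = 0.8  # "IL-2", nested inside the DNA span

mrc_input = build_mrc_input(QUESTIONS["protein"], tokens)
entities = extract(tokens, {"DNA": dna, "protein": protein})
print(entities)
```

Scoring every (start, end) pair separately for each label is what lets a protein mention sit inside a longer DNA mention without the two competing for the same token tags, which is the difficulty a single sequence of BIO labels runs into.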

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 62006108), Postdoctoral Research Foundation of China (No. 2022M710593), Liaoning Provincial Science and Technology Fund project (No. 2021-BS-201), Liaoning Province General Higher Education Undergraduate Teaching Reform Research Project (Liaoning Education Office [2022] No. 160), Liaoning Normal University Undergraduate Teaching Reform Research and Practice Project (No. LSJG202210).

Author information

Authors and Affiliations

  1. School of Computer and Artificial Intelligence, Liaoning Normal University, Dalian, China

    Xinyu He, Shixin Li, Guangda Zhao, Xue Han & Qiangjian Zhuang

  2. Information and Communication Engineering Postdoctoral Research Station, Dalian University of Technology, Dalian, China

    Xinyu He

  3. Postdoctoral Workstation of Dalian Yongjia Electronic Technology Co., Ltd., Dalian, China

    Xinyu He

Authors
  1. Xinyu He

  2. Shixin Li

  3. Guangda Zhao

  4. Xue Han

  5. Qiangjian Zhuang

Corresponding author

Correspondence to Xinyu He.

Editor information

Editors and Affiliations

  1. Emory University, Atlanta, GA, USA

    Fei Liu

  2. Microsoft Research Asia, Beijing, China

    Nan Duan

  3. Soochow University, Suzhou, China

    Qingting Xu

  4. Soochow University, Suzhou, China

    Yu Hong

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

He, X., Li, S., Zhao, G., Han, X., Zhuang, Q. (2023). Multi-task Biomedical Overlapping and Nested Information Extraction Model Based on Unified Framework. In: Liu, F., Duan, N., Xu, Q., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2023. Lecture Notes in Computer Science, vol 14303. Springer, Cham. https://doi.org/10.1007/978-3-031-44696-2_21
