
Echo State vs. LSTM Networks for Word Sense Disambiguation

  • Conference paper

Abstract

Inspired by bidirectional long short-term memory (LSTM) recurrent neural network (RNN) architectures, commonly applied in natural language processing (NLP) tasks, we have investigated an alternative bidirectional RNN structure consisting of two echo state networks (ESNs). Like the widely applied BiLSTM architectures, the BiESN structure accumulates information from both the left and right contexts of the target word, thus accounting for all available information within the text. The main advantages of BiESN over BiLSTM are the smaller number of trainable parameters and a simpler training algorithm. The two modelling approaches have been compared on the word sense disambiguation (WSD) task in NLP. The accuracy of several BiESN architectures is compared with that of similar BiLSTM models trained and evaluated on the same data sets.
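The BiESN idea in the abstract can be illustrated with a minimal sketch: two fixed random reservoirs read the token sequence in opposite directions, their states are concatenated per token, and only a linear readout is trained. The reservoir sizes, the spectral-radius scaling, and the ridge-regression readout below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9):
    """Fixed random input and recurrent weights; the recurrent matrix is
    rescaled to the target spectral radius (echo state property heuristic)."""
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, seq):
    """Collect the reservoir state for every position in the sequence."""
    x = np.zeros(W.shape[0])
    states = []
    for u in seq:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

def biesn_features(seq, fwd, bwd):
    """Concatenate left-context (forward pass) and right-context
    (backward pass) reservoir states for each token, as in a BiESN."""
    h_fwd = run_reservoir(*fwd, seq)
    h_bwd = run_reservoir(*bwd, seq[::-1])[::-1]  # realign to token order
    return np.hstack([h_fwd, h_bwd])

# Toy sequence: 5 "word embeddings" of dimension 10.
seq = rng.normal(size=(5, 10))
fwd = make_reservoir(10, 50)
bwd = make_reservoir(10, 50)
H = biesn_features(seq, fwd, bwd)   # shape (5, 100)

# Only the linear readout is trained, here by ridge regression
# against toy per-token targets (e.g. sense labels, one-hot).
Y = rng.normal(size=(5, 3))
W_out = np.linalg.solve(H.T @ H + 1e-3 * np.eye(H.shape[1]), H.T @ Y)
```

Training reduces to one least-squares solve for `W_out`, which is the source of the "fewer trainable parameters, simpler training" advantage over BiLSTM claimed in the abstract.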



Acknowledgments

This research has been funded by the Bulgarian National Science Fund grant number 02/12/2016—Deep Models of Semantic Knowledge (DemoSem) and was partially supported by the National Scientific Program “Information and Communication Technologies for a Single Digital Market in Science, Education and Security (ICTinSES)”, financed by the Ministry of Education and Science. Alexander Popov was also partially supported by the Bulgarian Ministry of Education and Science under the National Research Programme “Young scientists and postdoctoral students” approved by DCM # 577/17.08.2018.

Author information

Authors and Affiliations

  1. IICT, Bulgarian Academy of Sciences, Sofia, Bulgaria

    Alexander Popov, Petia Koprinkova-Hristova, Kiril Simov & Petya Osenova

Authors
  1. Alexander Popov

  2. Petia Koprinkova-Hristova

  3. Kiril Simov

  4. Petya Osenova

Corresponding author

Correspondence to Petia Koprinkova-Hristova.

Editor information

Editors and Affiliations

  1. Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany

    Igor V. Tetko

  2. Institute of Computer Science, Czech Academy of Sciences, Prague 8, Czech Republic

    Věra Kůrková

  3. Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany

    Pavel Karpov

  4. Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany

    Fabian Theis


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Popov, A., Koprinkova-Hristova, P., Simov, K., Osenova, P. (2019). Echo State vs. LSTM Networks for Word Sense Disambiguation. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, vol. 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_10

