- Alexander Popov (ORCID: 0000-0001-7676-3600)
- Petia Koprinkova-Hristova (ORCID: 0000-0002-0447-9667)
- Kiril Simov (ORCID: 0000-0003-3555-0179)
- Petya Osenova (ORCID: 0000-0002-4484-5027)
Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11731)
Included in the conference series: International Conference on Artificial Neural Networks (ICANN)
Abstract
Inspired by the bidirectional long short-term memory (LSTM) recurrent neural network (RNN) architectures commonly applied to natural language processing (NLP) tasks, we investigate an alternative bidirectional RNN structure consisting of two echo state networks (ESNs). Like the widely used BiLSTM architectures, the BiESN structure accumulates information from both the left and the right context of the target word, thus accounting for all available information within the text. The main advantages of BiESN over BiLSTM are a smaller number of trainable parameters and a simpler training algorithm. The two modelling approaches are compared on the word sense disambiguation (WSD) task in NLP: the accuracy of several BiESN architectures is compared with that of similar BiLSTM models trained and evaluated on the same data sets.
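To illustrate the bidirectional ESN idea summarized above, the following is a minimal NumPy sketch, not the authors' implementation: two fixed random reservoirs read the word-embedding sequence in opposite directions, their states are concatenated per word, and only a linear readout is trained (here via ridge regression on dummy targets). All names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9, scale=0.5):
    """Random input and recurrent weights; the recurrent matrix is rescaled
    so its spectral radius is below 1 (a common echo-state-property heuristic)."""
    W_in = rng.uniform(-scale, scale, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs):
    """Collect the reservoir state at each time step; nothing is trained here."""
    states, x = [], np.zeros(W.shape[0])
    for u in inputs:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x)
    return np.array(states)

def biesn_states(embeddings, fwd, bwd):
    """Concatenate a left-to-right pass with a right-to-left pass, so each
    word's feature vector reflects both its left and right context."""
    left = run_reservoir(*fwd, embeddings)
    right = run_reservoir(*bwd, embeddings[::-1])[::-1]
    return np.hstack([left, right])

# Toy dimensions: 50-dim word embeddings, 100-unit reservoirs.
n_in, n_res = 50, 100
fwd = make_reservoir(n_in, n_res)
bwd = make_reservoir(n_in, n_res)

sentence = rng.normal(size=(7, n_in))        # 7 "word" embeddings
features = biesn_states(sentence, fwd, bwd)  # shape (7, 200)

# Only the linear readout is trained: ridge regression from the
# concatenated states to (here, dummy) one-hot sense targets.
Y = rng.normal(size=(7, 4))                  # 4 hypothetical senses
W_out = np.linalg.solve(features.T @ features + 1e-3 * np.eye(2 * n_res),
                        features.T @ Y).T    # shape (4, 200)
```

Because the reservoirs stay fixed, the only trained parameters are the readout weights, which is the parameter-count and training-simplicity advantage over BiLSTM that the abstract refers to.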
Acknowledgments
This research has been funded by the Bulgarian National Science Fund grant number 02/12/2016—Deep Models of Semantic Knowledge (DemoSem) and was partially supported by the National Scientific Program “Information and Communication Technologies for a Single Digital Market in Science, Education and Security (ICTinSES)”, financed by the Ministry of Education and Science. Alexander Popov was also partially supported by the Bulgarian Ministry of Education and Science under the National Research Programme “Young scientists and postdoctoral students” approved by DCM # 577/17.08.2018.
Author information
Authors and Affiliations
IICT, Bulgarian Academy of Sciences, Sofia, Bulgaria
Alexander Popov, Petia Koprinkova-Hristova, Kiril Simov & Petya Osenova
Corresponding author
Correspondence to Petia Koprinkova-Hristova.
Editor information
Editors and Affiliations
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Igor V. Tetko
Institute of Computer Science, Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Pavel Karpov
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Fabian Theis
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Popov, A., Koprinkova-Hristova, P., Simov, K., Osenova, P. (2019). Echo State vs. LSTM Networks for Word Sense Disambiguation. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, vol. 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30492-8
Online ISBN: 978-3-030-30493-5
eBook Packages: Computer Science, Computer Science (R0)