Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness Using Machine Translation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10024)

Abstract

This paper provides a comparative analysis of the performance of four state-of-the-art distributional semantic models (DSMs) over 11 languages, contrasting native language-specific models with the use of machine translation over English-based DSMs. The experimental results show a significant improvement (an average of 16.7% in Spearman correlation) from using state-of-the-art machine translation approaches. The results also show that the benefit of using the most informative corpus outweighs the possible errors introduced by machine translation. For all languages, combining machine translation with the Word2Vec English distributional model consistently provided the best results (average Spearman correlation of 0.68).
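
As a rough illustration of the setup the abstract describes, the sketch below translates word pairs into English, scores them with an English vector model, and compares the scores against human judgements using Spearman correlation. It is a minimal sketch and not the authors' implementation: the dictionary `TOY_TRANSLATIONS` stands in for a machine translation service, `TOY_ENGLISH_VECTORS` stands in for an English DSM such as Word2Vec, and the German word pairs and gold scores are invented for illustration only.

```python
import math

# Hypothetical stand-in for a machine translation service (assumption).
TOY_TRANSLATIONS = {"hund": "dog", "katze": "cat", "auto": "car", "banane": "banana"}

# Hypothetical stand-in for an English distributional model (assumption).
TOY_ENGLISH_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "banana": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relatedness_via_translation(word1, word2):
    """Translate both words to English, then score them with the English vectors."""
    en1 = TOY_TRANSLATIONS[word1]
    en2 = TOY_TRANSLATIONS[word2]
    return cosine(TOY_ENGLISH_VECTORS[en1], TOY_ENGLISH_VECTORS[en2])


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented word pairs and human relatedness scores (not data from the paper).
pairs = [("hund", "katze"), ("hund", "auto"), ("auto", "banane"), ("katze", "banane")]
gold_scores = [3.5, 1.0, 0.5, 0.8]

model_scores = [relatedness_via_translation(w1, w2) for w1, w2 in pairs]
rho = spearman(model_scores, gold_scores)
print(f"Spearman correlation on the toy data: {rho:.2f}")
```

In a real evaluation the toy dictionary would be replaced by an actual translation system and the toy vectors by a trained English model; the correlation step stays the same.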

Notes

  1. The service is available at http://rebrand.ly/dinfra.

Acknowledgments

This publication has emanated from research supported by the National Council for Scientific and Technological Development, Brazil (CNPq) and by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

Author information

Authors and Affiliations

  1. Department of Computer Science and Mathematics, University of Passau, Innstrasse 43, ITZ-110, 94032, Passau, Germany

    André Freitas, Juliano Efson Sales & Siegfried Handschuh

  2. Insight Centre for Data Analytics, National University of Ireland, Galway, IDA Business Park, Lower Dangan, Galway, Ireland

    Siamak Barzegar & Brian Davis

Authors
  1. André Freitas

  2. Siamak Barzegar

  3. Juliano Efson Sales

  4. Siegfried Handschuh

  5. Brian Davis

Corresponding author

Correspondence to André Freitas.

Editor information

Editors and Affiliations

  1. Linköping University, Linköping, Sweden

    Eva Blomqvist

  2. University of Bologna, Bologna, Italy

    Paolo Ciancarini

  3. University of Bologna, Bologna, Italy

    Francesco Poggi

  4. University of Bologna, Bologna, Italy

    Fabio Vitali

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Freitas, A., Barzegar, S., Sales, J.E., Handschuh, S., Davis, B. (2016). Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness Using Machine Translation. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_14
