Eliminating Incorrect Cross-Language Links in Wikipedia

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10570)

Abstract

Many Wikipedia articles that cover the same topic in different language editions are interconnected via cross-language links that enable the understanding of topics in multiple languages, as well as cross-language information retrieval applications. However, cross-language links are added manually by the users of Wikipedia and, as such, are often incorrect. In this paper, we propose an approach to automatically eliminate incorrect cross-language links based on the observation that groups of articles that are pairwise connected through cross-language links form independent connected components. For each incoherent component (i.e., one that contains two or more articles from the same language edition), our approach assigns a correctness score to its cross-language links and removes those with the lowest score to make the component coherent. The results of our evaluation on a snapshot of Wikipedia in 8 languages indicate that our approach shows quantitative promise.
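The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the correctness score is computed, so the scores below are hypothetical inputs, and removing one lowest-scoring link at a time until every component is coherent is an assumed strategy. The names `components`, `is_coherent`, and `make_coherent`, and the sample articles, are invented for illustration. Articles are modeled as `(language, title)` pairs.

```python
from collections import defaultdict


def components(nodes, edges):
    """Return the connected components (as sets) of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:  # iterative depth-first traversal
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def is_coherent(comp):
    """A component is coherent if it has at most one article per language."""
    langs = [lang for lang, _title in comp]
    return len(langs) == len(set(langs))


def make_coherent(nodes, scored_links):
    """Greedily drop the lowest-scoring link of an incoherent component.

    scored_links maps (article_a, article_b) -> score. The scores are
    hypothetical placeholders; the paper's scoring function is not shown here.
    """
    kept = dict(scored_links)
    removed = []
    while True:
        comps = components(nodes, kept.keys())
        bad = [c for c in comps if not is_coherent(c)]
        if not bad:
            return comps, removed
        # An incoherent component has at least two articles, hence a link.
        candidates = [e for e in kept if e[0] in bad[0]]
        worst = min(candidates, key=lambda e: kept[e])
        removed.append(worst)
        del kept[worst]


# Hypothetical example: an Italian article wrongly linked to a second
# English article, which makes the component incoherent.
nodes = [("en", "Paris"), ("fr", "Paris"), ("it", "Parigi"), ("en", "Paris Hilton")]
links = {
    (("en", "Paris"), ("fr", "Paris")): 0.9,
    (("fr", "Paris"), ("it", "Parigi")): 0.8,
    (("it", "Parigi"), ("en", "Paris Hilton")): 0.1,
}
comps, removed = make_coherent(nodes, links)
```

In this example the single component contains two English articles, so the link with the lowest hypothetical score is removed and the graph splits into two coherent components.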

Author information

Authors and Affiliations

  1. LRI, CentraleSupélec, Paris-Saclay University, 91190, Gif-sur-Yvette, France

    Nacéra Bennacer, Francesca Bugiotti, Jorge Galicia, Mariana Patricio & Gianluca Quercini

Authors
  1. Nacéra Bennacer

  2. Francesca Bugiotti

  3. Jorge Galicia

  4. Mariana Patricio

  5. Gianluca Quercini

Corresponding author

Correspondence to Gianluca Quercini.

Editor information

Editors and Affiliations

  1. University of Sydney, Darlington, NSW, Australia

    Athman Bouguettaya

  2. Zhejiang University, Hangzhou, China

    Yunjun Gao

  3. Institute of Computing for Physics and Technology, Protvino, Russia

    Andrey Klimenko

  4. Nanyang Technological University, Singapore, Singapore

    Lu Chen

  5. King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

    Xiangliang Zhang

  6. Institute of Computing for Physics and Technology, Protvino, Russia

    Fedor Dzerzhinskiy

  7. Shanghai Jiao Tong University, Minhang Qu, China

    Weijia Jia

  8. Institute of Computing for Physics and Technology, Protvino, Russia

    Stanislav V. Klimenko

  9. City University of Hong Kong, Kowloon, Hong Kong

    Qing Li

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Bennacer, N., Bugiotti, F., Galicia, J., Patricio, M., Quercini, G. (2017). Eliminating Incorrect Cross-Language Links in Wikipedia. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10570. Springer, Cham. https://doi.org/10.1007/978-3-319-68786-5_9
