Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13397)
Abstract
In this paper, we describe and evaluate Arbobanko, a syntactic treebank for the artificial language Esperanto, together with the methods and tools used to produce it. For an under-resourced language, the quality of automatic syntactic pre-annotation is of obvious importance. By evaluating the parser associated with the treebank, we try to answer the question of whether the language's extremely regular morphology and low lexical ambiguity carry over into a more regular syntax and higher parsing accuracy. On the linguistic side, the treebank allows us to address and quantify the typological issue of (free) word order in Esperanto.
Notes
1. Adjects are defined as adverbial modifiers in adjp's and advp's, i.e. of adjectives and adverbs.
2. For Esperanto, we have adopted the "semantic prototype" ontology described at http://visl.sdu.dk/semantic_prototypes_overview.pdf.
3. Yes/no questions with the question particle "ĉu" were not excluded, but were not statistically salient, because only a few contained finite verbs.
4. Non-finite clauses do not take subjects in Esperanto.
5. These participles carry an adjectival -a ending, and inflect/agree with regard to number and case, allowing them to function as postnominal non-finite clauses, marked @ICL-N< in the treebank, unlike the @ICL-AUX< (argument of auxiliary) we are concerned with here.
Author information
Authors and Affiliations
Institute of Language and Communication, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
Eckhard Bick
Corresponding author
Correspondence to Eckhard Bick.
Editor information
Editors and Affiliations
Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Bick, E. (2023). Arbobanko - A Treebank for Esperanto. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13397. Springer, Cham. https://doi.org/10.1007/978-3-031-23804-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23803-1
Online ISBN: 978-3-031-23804-8
eBook Packages: Computer Science, Computer Science (R0)