UBY

From Wikipedia, the free encyclopedia

Software for natural language processing

For the ISO 639 code, see Ubykh language.

UBY
Version	1.7
Framework	Java
Type	Multilingual lexical semantic resource
License	Free licenses for the software, mix of licenses for the included resources
Website	https://www.ukp.tu-darmstadt.de/data/lexical-resources/uby

UBY^[1] is a large-scale lexical-semantic resource for natural language processing (NLP) developed at the Ubiquitous Knowledge Processing Lab (UKP) in the department of Computer Science of the Technische Universität Darmstadt. UBY is based on the ISO standard Lexical Markup Framework (LMF) and combines information from several expert-constructed and collaboratively constructed resources for English and German.

UBY applies a word sense alignment approach (a subfield of word sense disambiguation) to combine information about nouns and verbs.^[2] Currently, UBY contains 12 integrated resources in English and German.

Included resources

English resources: WordNet, Wiktionary, Wikipedia, FrameNet, VerbNet, OmegaWiki
German resources: German Wikipedia, German Wiktionary, OntoWiktionary, GermaNet and IMSLex-Subcat
Multilingual resources: OmegaWiki

Format

Main article: UBY-LMF

UBY-LMF^[3]^[4] is a format for standardizing lexical resources for natural language processing (NLP).^[5] UBY-LMF conforms to the ISO standard for lexicons, LMF, which was designed within ISO/TC 37, and constitutes a so-called serialization of this abstract standard.^[6] In accordance with LMF, all attributes and other linguistic terms introduced in UBY-LMF refer to standardized descriptions of their meaning in ISOCat.

Availability and versions

UBY is available as part of the open resource repository DKPro. DKPro UBY is a Java framework for creating and accessing sense-linked lexical resources in accordance with the UBY-LMF lexicon model. While the code of UBY is licensed under a mix of free licenses such as the GPL and CC BY-SA, some of the included resources are under different licenses, for example licenses restricted to academic use only.

There is also a Semantic Web version of UBY called lemonUby.^[7] lemonUby is based on the lemon model as proposed in the Monnet project. lemon is a model for representing lexicons and machine-readable dictionaries and for linking them to the Semantic Web and the Linked Data cloud.

UBY vs. BabelNet

BabelNet is an automatically constructed lexical semantic resource that links Wikipedia to the most popular computational lexicons such as WordNet. At first glance, UBY and BabelNet seem to be identical, competing projects; however, the two resources follow different philosophies. In its early stage, BabelNet was primarily based on the alignment of WordNet and Wikipedia, which by the very nature of Wikipedia implied a strong focus on nouns, and especially named entities. Later on, the focus of BabelNet shifted more towards other parts of speech. UBY, by contrast, focused from the very beginning on verb information, especially syntactic information, which is contained in resources such as VerbNet or FrameNet.

Another main difference is that UBY models the included resources completely and independently of each other, so that UBY can be used as a wholesale replacement for each of them. Collective access to multiple resources is provided through the available resource alignments. Moreover, the LMF model in UBY allows a unified way of accessing all resources together as well as individual resources. BabelNet, in contrast, follows an approach similar to WordNet and merges selected information types into so-called Babel synsets. This makes access to and processing of the knowledge more convenient, but it blurs the lines between the linked knowledge bases. Additionally, BabelNet enriches the original resources, e.g., by providing automatically created translations for concepts that are not lexicalized in a particular language. Although this greatly increases coverage for multilingual applications, the automatic inference of information is always prone to a certain degree of error.

In summary, given these differences, one resource or the other may be preferable depending on the particular application scenario. In fact, the two resources can be used together to provide extensive lexicographic knowledge, especially if they are linked. The open and well-documented structure of the two resources is an important step towards this goal.
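
The design in which each resource is kept intact and connected to the others only through sense alignments can be illustrated with a minimal sketch. The following Python code is not the DKPro UBY API; the class names, sense identifiers and glosses are invented for illustration. It only shows the general idea that per-resource access and cross-resource access can coexist when alignments are stored as separate links rather than merged into a single entry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sense:
    """One sense as it appears in a single source resource (hypothetical model)."""
    resource: str   # e.g. "WordNet" or "Wiktionary"
    sense_id: str   # invented identifier, unique within the resource
    lemma: str
    gloss: str

    @property
    def key(self):
        return (self.resource, self.sense_id)


@dataclass
class AlignedLexicon:
    """Keeps every resource's senses separate and stores alignments as links."""
    senses: dict = field(default_factory=dict)    # key -> Sense
    alignments: set = field(default_factory=set)  # frozensets of two keys

    def add_sense(self, sense):
        self.senses[sense.key] = sense

    def align(self, a, b):
        # An alignment links two already registered, distinct senses.
        if a.key == b.key:
            raise ValueError("a sense cannot be aligned with itself")
        if a.key not in self.senses or b.key not in self.senses:
            raise KeyError("both senses must be added before aligning them")
        self.alignments.add(frozenset((a.key, b.key)))

    def senses_in(self, resource, lemma):
        # Per-resource access: only the senses of one resource are returned.
        return [s for s in self.senses.values()
                if s.resource == resource and s.lemma == lemma]

    def aligned_to(self, sense):
        # Cross-resource access: follow alignment links from one sense.
        linked = []
        for pair in self.alignments:
            if sense.key in pair:
                (other_key,) = pair - {sense.key}
                linked.append(self.senses[other_key])
        return linked


# Invented example data; the identifiers and glosses are placeholders.
lexicon = AlignedLexicon()
wn_sense = Sense("WordNet", "wn-bank-1", "bank", "a financial institution")
wkt_sense = Sense("Wiktionary", "wkt-bank-1", "bank", "an institution that handles money")
wkt_other = Sense("Wiktionary", "wkt-bank-2", "bank", "the land beside a river")
for s in (wn_sense, wkt_sense, wkt_other):
    lexicon.add_sense(s)
lexicon.align(wn_sense, wkt_sense)
```

In this sketch, `senses_in` corresponds to using one resource on its own, while `aligned_to` corresponds to collective access through alignments. A merged, synset-style design would instead store the aligned senses as one combined entry, which is more convenient to query but no longer separates the contributions of the individual resources.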

Applications

UBY has been used in several NLP tasks, such as word sense disambiguation,^[8] word sense clustering,^[9] verb sense labeling^[10] and text classification.^[11] UBY also inspired other projects on the automatic construction of lexical semantic resources.^[12] Furthermore, lemonUby was used to improve machine translation results, especially in finding translations for unknown words.^[13]

References

^Iryna Gurevych; Judith Eckle-Kohler; Silvana Hartmann; Michael Matuschek; Christian M. Meyer; Christian Wirth (April 2012). UBY - A Large-Scale Unified Lexical-Semantic Resource Based on LMF. Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics. pp. 580–590. ISBN 978-1-937284-19-0. S2CID 9692934. Wikidata Q51752742.
^Matuschek, Michael: Word Sense Alignment of Lexical Resources. Technische Universität, Darmstadt [Dissertation], (2015)
^Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek, Christian M. Meyer: UBY-LMF – exploring the boundaries of language-independent lexicon models, in Gil Francopoulo, LMF Lexical Markup Framework, ISTE / Wiley 2013 (ISBN 978-1-84821-430-9)
^Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek and Christian M. Meyer. UBY-LMF – A Uniform Model for Standardizing Heterogeneous Lexical-Semantic Resources in ISO-LMF. In: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk and Stelios Piperidis: Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), p. 275–282, May 2012.
^Gottfried Herzog, Laurent Romary, Andreas Witt: Standards for Language Resources. Poster Presentation at the META-FORUM 2013 – META Exhibition, September 2013, Berlin, Germany.
^Laurent Romary: TEI and LMF crosswalks. CoRR abs/1301.2444 (2013)
^Judith Eckle-Kohler, John Philip McCrae and Christian Chiarcos: lemonUby – a large, interlinked, syntactically-rich lexical resource for ontologies. In: Semantic Web Journal, vol. 6, no. 4, p. 371-378, 2015.
^Christian M. Meyer and Iryna Gurevych: To Exhibit is not to Loiter: A Multilingual, Sense-Disambiguated Wiktionary for Measuring Verb Similarity, in: Proceedings of the 24th International Conference on Computational Linguistics (COLING), Vol. 4, p. 1763–1780, December 2012. Mumbai, India.
^Michael Matuschek, Tristan Miller and Iryna Gurevych: A Language-independent Sense Clustering Approach for Enhanced WSD. In: Josef Ruppert and Gertrud Faaß: Proceedings of the 12th Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2014), p. 11-21, Universitätsverlag Hildesheim, October 2014.
^Kostadin Cholakov and Judith Eckle-Kohler and Iryna Gurevych : Automated Verb Sense Labelling Based on Linked Lexical Resources. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014), p. 68-77, Association for Computational Linguistics
^Lucie Flekova and Iryna Gurevych: Personality Profiling of Fictional Characters using Sense-Level Links between Lexical Resources, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), September 2015.
^José Gildo de A. Júnior, Ulrich Schiel, and Leandro Balby Marinho. 2015. An approach for building lexical-semantic resources based on heterogeneous information sources. In Proceedings of the 30th Annual ACM Symposium on Applied Computing (SAC '15). ACM, New York, USA, 402-408. DOI: 10.1145/2695664.2695896, http://doi.acm.org/10.1145/2695664.2695896
^J. P. McCrae, P. Cimiano: Mining translations from the web of open linked data, in: Proceedings of the Joint Workshop on NLP&LOD and SWAIE: Semantic Web, Linked Open Data and Information Extraction, pp 9-13 (2013).

Retrieved from "https://en.wikipedia.org/w/index.php?title=UBY&oldid=1235670369"