UBY - The Resource

UBY is a large-scale instantiation of the UBY-LMF model. Currently, UBY consists of ten expert-built and collaboratively created lexical semantic resources in two languages, English and German. A subset of these resources is linked at the word sense level, yielding monolingual and cross-lingual sense alignments between resource pairs. The following figure shows the LMF lexicon instances and the pairwise sense alignments between them. Solid lines represent sense alignments that are already available; dotted lines indicate sense alignments we will integrate in the future.

Downloads

We provide database dumps of UBY databases in two different sizes. The dumps contain only resources with open licenses, i.e., GermaNet is never included. The version number in the name of a database dump indicates which version of the UBY-API the dump is compatible with. See details on Google Code.

You can download the database dumps here (older versions, such as the UBY 1.0 database dump, can be found in the subfolder oldVersions):

Alternatively, you can create UBY databases yourself by converting the resources you need.

Find the source code for the conversion of free and licensed resources (e.g., GermaNet) and a Conversion Tutorial on Google Code. Please note that we do not yet provide tools for importing sense alignments into a UBY database. In the near future, we will make the source code for the conversion of the sense alignments in UBY available as well.

Download additional materials required for the conversion here:


When using UBY, please cite the following paper:

Iryna Gurevych, Judith Eckle-Kohler, Silvana Hartmann, Michael Matuschek, Christian M. Meyer, and Christian Wirth:
UBY - A Large-Scale Unified Lexical-Semantic Resource Based on LMF, in: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), p. 580-590, April 2012. Avignon, France.
PDF | BibTeX | Proceedings | Supplementary Data

Information Types

The following table gives an overview of the most important information types in UBY and their statistics:


 

 

Alignments

As shown in the figure above, UBY contains a number of pairwise sense alignments between resources. They are an important prerequisite for semantic interoperability (with respect to word senses) within UBY. Some of these alignments were created manually by the resource developers, others automatically by state-of-the-art algorithms.
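To illustrate what a pairwise sense alignment amounts to as a data structure, here is a minimal sketch in Python. It does not use the UBY-API or the UBY database schema; the resource names (`resource_a`, `resource_b`, `resource_c`), the sense identifiers, and the helper functions `build_index` and `aligned_senses` are all placeholders invented for this example.

```python
from collections import defaultdict

# Each alignment links one sense in one resource to one sense in another.
# Resource names and sense identifiers below are hypothetical placeholders.
alignments = [
    (("resource_a", "a1"), ("resource_b", "b7")),
    (("resource_a", "a1"), ("resource_c", "c3")),
    (("resource_b", "b2"), ("resource_c", "c9")),
]


def build_index(pairs):
    """Build a symmetric lookup table from a list of aligned sense pairs."""
    index = defaultdict(set)
    for left, right in pairs:
        index[left].add(right)
        index[right].add(left)
    return index


def aligned_senses(index, sense):
    """Return all senses directly aligned with the given sense, sorted."""
    return sorted(index.get(sense, set()))


index = build_index(alignments)
print(aligned_senses(index, ("resource_a", "a1")))
```

Storing the alignment symmetrically means that a sense can be looked up from either side of a resource pair, which is the property that lets aligned resources be used together at the word sense level.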

The following table gives an overview of the sense alignments in UBY release 1.0:

 

Conversion to UBY-LMF

We provide our conversion routines as open source software on Google Code.

We evaluated the conversion routines by comparing the statistics of the information types in the original expert-built lexical semantic resources to their counterparts in UBY. The original statistics were obtained by means of the resource APIs, unless otherwise stated. 

 We do not present this type of evaluation for the collaboratively created resources, as these are constantly updated: there is no stable reference version for, e.g., Wiktionary.
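The kind of comparison described above can be sketched as follows. This is only an illustration of the idea, not the evaluation code we used: the function `compare_statistics`, the information type names, and all counts are made up for the example.

```python
def compare_statistics(original, converted):
    """Compare per-information-type counts of an original resource with
    the counts found in its converted counterpart.

    Returns a dict mapping each information type whose counts differ to a
    tuple (original_count, converted_count). A type missing on one side is
    reported with None for that side.
    """
    mismatches = {}
    for info_type in sorted(set(original) | set(converted)):
        original_count = original.get(info_type)
        converted_count = converted.get(info_type)
        if original_count != converted_count:
            mismatches[info_type] = (original_count, converted_count)
    return mismatches


# Hypothetical counts, e.g. obtained from a resource API and from the
# converted database; these numbers are not real UBY statistics.
original_counts = {"lemmas": 1000, "senses": 1500, "sense_relations": 400}
converted_counts = {"lemmas": 1000, "senses": 1498, "sense_relations": 400}

print(compare_statistics(original_counts, converted_counts))
```

An empty result indicates that the counts of all compared information types agree; any entry points to an information type whose conversion should be inspected more closely.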

You can download the evaluation statistics here: uby_conversion_statistics_elsr.pdf

 
