The research area Lexical-Semantic Resources and Algorithms is concerned with the analysis, design, and application of lexical-semantic resources (LSRs) for natural language processing. At the core of our work is a multi-year effort to develop the large-scale, sense-linked unified resource UBY, which contains multiple expert-built and collaboratively constructed LSRs for English and German. We are also interested in applying UBY to semantic processing tasks, such as Word Sense Disambiguation or Semantic Role Labeling, and to end-user applications like Question Answering. Another important topic is the use of LSRs in the Digital Humanities.

Current Projects

  • UBY: UBY is a large-scale lexical-semantic resource based on the ISO standard Lexical Markup Framework (LMF), combining a wide range of information from expert-constructed and collaboratively constructed resources for English and German. It is being further developed as part of CEDIFOR, in cooperation with the research area Text Mining & Analytics. Most UBY-related software is available as open source on GitHub.

  • Integrating Collaborative and Linguistic Resource for Word Sense Disambiguation and Semantic Role Labeling (InCoRe): In the InCoRe project, we address the lack of coverage typically associated with lexical-semantic resources. The first major goal of this project is to integrate various expert-built and collaboratively created lexical-semantic resources into a large-scale resource of unprecedented coverage and quality. The second major goal is to scale natural language processing technologies that rely on lexical-semantic resources, specifically word sense disambiguation and semantic role labeling, to real-life applications based on the developed resource.

  • Educational Web 2.0 (EduWeb): In the EduWeb project, we seek to implement our vision of technology-enhanced education for the 21st century. A vast amount of content is produced by many people every day, but despite their interconnection through the World Wide Web, their efforts are often isolated from each other. To overcome this problem, the UKP Lab will provide and explore new algorithms that simplify tedious, recurring tasks and improve coordination with the community.

  • QA-EduInf: Community-based Question Answering for Educational Information. The project aims to use natural language processing techniques to analyze educational information and answer user questions on various educational topics. Since a large portion of users' questions have already been asked by other people in community question answering forums and answered by educational experts or the crowd, we use the available question and answer archives to answer these questions and to minimize the human effort of searching through educational information. The project consists of several components, including question classification, question and answer retrieval, answer quality assessment, and summarization.

  • Information Consolidation: A New Paradigm in Knowledge Search (DIP Project): The DIP project, an international cooperation with Bar-Ilan University and Israel Institute of Technology, aims at the next big step in information access technology. The goal is to support users in identifying and assimilating the large set of relevant statements scattered across the many documents that current search technologies typically retrieve. Novel methods for statement extraction, information consolidation, and relation inference form the core research areas of this project.

Past Projects

  • QA-EL: The project investigates novel applications of dynamic lexical-semantic resources (such as Wikipedia and other Web 2.0 sources) for information search in eLearning.

  • LOEWE Digital Humanities: This project deals with the analysis of contemporary corpora. At UKP, we focus on developing and applying the linked lexical resource UBY in humanities applications that require structured semantic knowledge.

  • Semantic Information Retrieval 1,2,3 (SIR): This project systematically investigates the possible usage of semantic and lexical relationships between words or concepts for improving the information retrieval process. The main focus is on semantic relatedness measures using different knowledge sources (e.g. WordNet, GermaNet, or Wikipedia).

Data and Tools


  • UBY – the resource: Database dumps and related data

  • UBY-LMF - A Comprehensive Instantiation Of ISO-LMF

  • DKPro Uby: A Java framework for creating and accessing sense-linked lexical resources in accordance with the UBY-LMF lexicon model

Sense Alignments contained in UBY

Other Resources and Datasets

Other APIs and Tools

  • JOWKL: A Java-API for accessing the resource OmegaWiki

  • JWPL: A Java-API for accessing Wikipedia, as well as its revision history

  • JWKTL: A Java-API for accessing Wiktionary

  • DKPro-WSD: An open-source library for performing Word Sense Disambiguation



Context-Aware Representations for Knowledge Base Relation Extraction

Daniil Sorokin, Iryna Gurevych
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 1785-1790, September 2017
Association for Computational Linguistics
[Online-Edition: https://github.com/UKPLab/emnlp2017-relation-extraction]

LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test

Michael Bugert, Yevgeniy Puzikov, Andreas Rücklé, Judith Eckle-Kohler, Teresa Martin, Eugenio Martínez Cámara, Daniil Sorokin, Maxime Peyrard, Iryna Gurevych
In: Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem, held in conjunction with EACL2017), p. 56-61, April 2017
Association for Computational Linguistics
[Online-Edition: https://github.com/UKPLab/lsdsem2017-story-cloze]

A Consolidated Open Knowledge Representation for Multiple Texts

Rachel Wities, Vered Shwartz, Gabriel Stanowsky, Meni Adler, Ori Shapira, Shyam Upadhyay, Dan Roth, Eugenio Martínez Cámara, Iryna Gurevych, Ido Dagan
In: Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics, p. 12-24, April 2017
Association for Computational Linguistics

End-to-end Representation Learning for Question Answering with Weak Supervision

Daniil Sorokin, Iryna Gurevych
In: ESWC 2017 Semantic Web Challenges, to appear, 2017

Linked Lexical Knowledge Bases: Foundations and Applications

Iryna Gurevych, Judith Eckle-Kohler, Michael Matuschek
In: Synthesis Lectures on Human Language Technologies, July 2016
Morgan & Claypool Publishers
[Online-Edition: http://www.morganclaypoolpublishers.com/catalog_Orig/product_info.php?cPath=22&series=29&products_id=958]

Sense-annotating a lexical substitution data set with Ubyline

Tristan Miller, Mohamed Khemakhem, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), p. 828-835, May 2016
European Language Resources Association (ELRA)
[Online-Edition: https://www.ukp.tu-darmstadt.de/data/sense-labelling-resources/glass/]

Enriching Wikidata with Frame Semantics

Hatem Mousselly Sergieh, Iryna Gurevych
In: Proceedings of the 5th Workshop on Automated Knowledge Base Construction (AKBC) 2016 held in conjunction with NAACL 2016, p. 29-34, 2016