Under the heading of Language Technology for Digital Humanities, UKP Lab conducts projects at the boundary between Natural Language Processing and Computer Science on the one hand, and the Humanities, Social Sciences, and Educational Research on the other. In particular, we work on making digital analysis methods more accessible to the text-based humanities, implement tools to explore and annotate text corpora, and contribute to the infrastructures supporting Digital Humanities. Our research interests in this area include:

  • Creating user-friendly tools to explore and annotate text corpora

  • Analyzing corpora at the semantic level, e.g. opinion mining or identifying metaphoric language

  • Processing and analyzing historical texts

  • Interoperability with Digital Humanities infrastructures such as DARIAH and CLARIN

Current Projects

  • DKPro: At UKP, we believe in supporting reproducible NLP research through re-usable and freely available software components. To this end, UKP created the award-winning DKPro repository of open-source software covering many aspects of NLP, from pre-processing and lexical resources to machine learning and semantic analysis. As DKPro grows and gains popularity, it is evolving into a community project in which UKP collaborates, e.g., with researchers from the University of Duisburg-Essen.

  • CLARIN F-AG7 KP 3: In association with the CLARIN project, we are building the flexible, web-based annotation tool WebAnno and applying it to the annotation of non-standard varieties of German at the semantic level. This work is done in collaboration with the Language Technology Group in Darmstadt and with researchers from the University of Heidelberg.

  • CEDIFOR: In this project, we aim to foster interdisciplinary work between Computer Science and Digital Humanities by providing know-how and research infrastructures for text analytics to humanities researchers in the Rhein-Main area, supporting them in their investigation of novel research questions. This project is conducted in collaboration with the Goethe University Frankfurt am Main and the German Institute for International Educational Research (DIPF).

  • DARIAH-DE II: The mission of the EU-ESFRI-Project DARIAH-EU is to enhance and support digitally-enabled research across the arts and humanities. In the context of the second phase of the German contribution DARIAH-DE, UKP collaborates closely with researchers from the Julius Maximilians University of Würzburg to automatically detect and analyze narrative structures in German. These techniques are applied to a corpus of around 2,000 novels written over the last centuries.

  • Welt der Kinder: The digital humanities project “Welt der Kinder” is designed as a test model for similar future projects in the historical sciences. Through very close cooperation between historians, information scientists, and computer scientists, it aims to gain new insights into the way the world was conveyed to children in the period from 1850 until 1918, a time of accelerated production of knowledge that was equally dominated by globalization and nationalization.

  • Processing of audiovisual content: The amount of audiovisual content is constantly increasing, especially in the educational domain, making tasks like transcription and visual analysis very cumbersome for humanities researchers. This project aims to create technology which facilitates the integration of manual and automatic analysis of audiovisual content.

  • OpenMinTeD: OpenMinTeD aspires to enable the creation of an infrastructure that fosters and facilitates the discovery and use of text mining technologies and interoperable services. It examines several use cases identified by experts from different scientific areas, ranging from generic scholarly communication to literature related to life sciences, food and agriculture, and social sciences and humanities.

Past Projects

  • DARIAH-DE I: The mission of the EU-ESFRI-Project DARIAH-EU is to enhance and support digitally-enabled research across the arts and humanities. In the first phase of the German contribution DARIAH-DE, UKP investigated how the emerging DARIAH infrastructure can be used, both through the use case of setting up a digital archive and through the integration of DARIAH and TextGrid services.

  • CLARIN F-AG7 KP 1: In association with the CLARIN project, we developed the flexible web-based annotation tool WebAnno. The tool supports visual annotation of multiple linguistic layers, including custom-defined layers. It is interoperable with CLARIN infrastructures such as WebLicht. The tool was developed in close cooperation with the CLARIN F-AG7 KP 2 project, which defines “best practices” for linguistic annotation on several language layers for different annotator status groups. This work was done in collaboration with the Language Technology Group in Darmstadt.

  • LOEWE Research Center “Digital Humanities” TP 2.2 “Text as an Instance”: In this project, UKP collaborated very closely with linguists and computational linguists on the comparative analysis of non-canonical grammatical constructions in German and English. Because such constructions are infrequent and ambiguous, a dedicated analysis process and supporting tools had to be developed for annotation. The result is the CSniper annotation tool, which combines collaborative search and annotation in a user-friendly tool. This project was conducted with researchers from the Department of Linguistics and Literature in Darmstadt as well as from the Goethe University in Frankfurt am Main.

  • LOEWE Research Center “Digital Humanities” TP 2.3 “Text as a Process”: This project analyzed the linguistic properties of collaboratively created text in the Web 2.0. For more details, please refer to the respective section in the Text Analytics area description.

Completed PhD Theses


Collaborative Web-based Tools for Multi-layer Text Annotation

Chris Biemann, Kalina Bontcheva, Richard Eckart de Castilho, Iryna Gurevych, Seid Muhie Yimam
In: The Handbook of Linguistic Annotation, p. 229-256, May 2017
Springer Netherlands
[Online-Edition: http://www.springer.com/de/book/9789402408799]

Sense-annotating a lexical substitution data set with Ubyline

Tristan Miller, Mohamed Khemakhem, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), p. 828-835, May 2016
European Language Resources Association (ELRA)
[Online-Edition: https://www.ukp.tu-darmstadt.de/data/sense-labelling-resources/glass/]

Mass Collaboration on the Web: Textual Content Analysis by Means of Natural Language Processing

Ivan Habernal, Johannes Daxenberger, Iryna Gurevych
In: Mass Collaboration and Education, Vol. 16, p. 367-390, February 2016
Springer International Publishing
[Online-Edition: http://doi.org/10.1007/978-3-319-13536-6_18]

Enriching Wikidata with Frame Semantics

Hatem Mousselly Sergieh, Iryna Gurevych
In: Proceedings of the 5th Workshop on Automated Knowledge Base Construction (AKBC) 2016 held in conjunction with NAACL 2016, p. 29-34, 2016
Association for Computational Linguistics

In-tool Learning for Selective Manual Annotation in Large Corpora

Erik-Lân Do Dinh, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of ACL-IJCNLP 2015 System Demonstrations, p. 13-18, July 2015
Association for Computational Linguistics and The Asian Federation of Natural Language Processing
[Online-Edition: https://dkpro.github.io/dkpro-csniper/]

Analyzing Formulaic Patterns in Historical Corpora

Claudine Moulin, Iryna Gurevych, Natalia Filatkina, Richard Eckart de Castilho
In: Historical Corpora. Challenges and Perspectives, p. 51-64, 2015
Narr Publishing House

A Language-independent Sense Clustering Approach for Enhanced WSD

Michael Matuschek, Tristan Miller, Iryna Gurevych
In: Proceedings of the 12th Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2014), p. 11-21, October 2014
Universitätsverlag Hildesheim
[Online-Edition: https://www.ukp.tu-darmstadt.de/data/lexical-resources/wordnetgermanet-sense-clusters/]

GermEval-2014: Nested Named Entity Recognition with Neural Networks

Nils Reimers, Judith Eckle-Kohler, Carsten Schnober, Jungi Kim, Iryna Gurevych
In: Workshop Proceedings of the 12th Edition of the KONVENS Conference, p. 117-120, October 2014
Universitätsverlag Hildesheim
[Online-Edition: https://www.ukp.tu-darmstadt.de/research/ukp-in-challenges/germeval-2014/]

High Performance Word Sense Alignment by Joint Modeling of Sense Distance and Gloss Similarity

Michael Matuschek, Iryna Gurevych
In: Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014), p. 245-256, August 2014
Dublin City University and Association for Computational Linguistics
[Online-Edition: http://www.aclweb.org/anthology/C14-1025]