Personal Information

Dr. Richard Eckart de Castilho
(last name: Eckart de Castilho, first name: Richard)
Technical Lead
+49 (6151) 16 - 25299
+49 (6151) 16 - 25295
S2|02 B117
TU Darmstadt - FB 20
Hochschulstraße 10
64289 Darmstadt



Biographical Information

I earned my Diplom in Computer Science at the TU Darmstadt in 2006. From that time until September 2008, I worked for the Linguistische Profile interdisziplinärer Register (LingPro) DFG project at the TU Darmstadt. I gathered more experience with the UIMA framework during a three-month stay at the IBM Watson Research Center in Hawthorne, New York.

Current projects

Darmstadt Knowledge Processing Repository (DKPro)

I currently work as a technical lead on the Darmstadt Knowledge Processing Software Repository Project (DKPro). DKPro comprises many projects; those I work on most intensively are listed separately below. My main responsibilities are development process optimization, release management, documentation, and dissemination activities.

DKPro Core

DKPro Core is a collection of interoperable NLP components built on the Apache UIMA framework. The project integrates many proven tools and resources from the NLP community into a common processing framework and provides a common abstraction layer. Through the use of Apache UIMA, uimaFIT and Apache Maven, DKPro Core offers convenient access to NLP components and convenient implementation of NLP pipelines. I am one of the main developers of DKPro Core.

DKPro Lab

DKPro Lab is a lightweight framework for parameter sweeping experiments. It allows experiments consisting of multiple interdependent tasks to be set up in a declarative manner with minimal overhead. Data produced by a task for any particular parameter configuration is stored and re-used whenever possible to avoid the needless recalculation of results. Reports can be attached to each task to post-process the experimental results and present them in a convenient manner, e.g. as tables or charts. I am the main developer.
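
The idea of storing and re-using task results per parameter configuration can be sketched in a few lines of Python. This is only an illustration of the principle under simplifying assumptions, not DKPro Lab's actual (Java) API: the names `Task`, `execute` and `sweep`, the in-memory store, and the toy experiment are all hypothetical.

```python
import itertools


class Task:
    """A named step that reads some parameters and the outputs of other tasks."""

    def __init__(self, name, params, run, imports=()):
        self.name = name
        self.params = tuple(params)    # parameter names this task reads directly
        self.run = run                 # callable(config: dict, inputs: list) -> result
        self.imports = tuple(imports)  # upstream tasks whose output is needed

    def relevant_params(self):
        """Parameters that influence this task, including those of upstream tasks."""
        names = set(self.params)
        for dep in self.imports:
            names |= dep.relevant_params()
        return names


def execute(task, config, store, executed):
    """Return the task's result for `config`, reusing a stored result if one exists."""
    # The key contains only the parameters that can affect this task's output.
    key = (task.name, tuple(sorted((p, config[p]) for p in task.relevant_params())))
    if key not in store:
        inputs = [execute(dep, config, store, executed) for dep in task.imports]
        store[key] = task.run(config, inputs)
        executed.append(key)  # record real executions so that reuse is visible
    return store[key]


def sweep(task, param_space):
    """Run `task` for every combination in `param_space` (dict: name -> list of values)."""
    store, executed = {}, []
    names = sorted(param_space)
    results = {}
    for values in itertools.product(*(param_space[n] for n in names)):
        config = dict(zip(names, values))
        results[tuple(values)] = execute(task, config, store, executed)
    return results, executed


# Toy experiment (hypothetical): preprocessing depends only on `lowercase`,
# evaluation additionally depends on `threshold`.
preprocess = Task(
    "preprocess", ["lowercase"],
    lambda cfg, inputs: "Some Text".lower() if cfg["lowercase"] else "Some Text",
)
evaluate = Task(
    "evaluate", ["threshold"],
    lambda cfg, inputs: len(inputs[0]) > cfg["threshold"],
    imports=[preprocess],
)

results, executed = sweep(evaluate, {"lowercase": [True, False], "threshold": [5, 20]})
```

In this sketch the sweep covers four parameter combinations, but `preprocess` only runs twice, because its stored result is keyed by the parameters that can actually change its output.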

Apache uimaFIT™

Apache uimaFIT is an open source library that provides factories, injection, and testing utilities for Apache UIMA™. uimaFIT is a core technology at UKP because it greatly simplifies the development of UIMA components, allowing us to concentrate on implementing actual NLP functionality. I am a committer on the Apache UIMA project, working primarily on uimaFIT.

TT4J - TreeTagger for Java

TreeTagger for Java (TT4J) is an open source library which provides a Java API to Helmut Schmid's TreeTagger. TT4J is used by the DKPro TreeTagger UIMA component. I am the main developer.

JWPL - Java-based Wikipedia Library

JWPL (Java Wikipedia Library) is a free, Java-based application programming interface that provides access to all information contained in Wikipedia. I am a consulting developer.


I have co-organized the following events:

Publications


A tool for extracting sense-disambiguated example sentences through user feedback

Beto Boullosa, Richard Eckart de Castilho, Alexander Geyken, Lothar Lemnitzer, Iryna Gurevych
In: Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, p. 69--72, April 2017
Association for Computational Linguistics

Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs

Richard Eckart de Castilho, Nancy Ide, Emanuele Lapponi, Stephan Oepen, Keith Suderman, Erik Velldal, Marc Verhagen
In: Proceedings of the 11th Linguistic Annotation Workshop (LAW XI) at EACL 2017, to appear, April 2017

Collaborative Web-based Tools for Multi-layer Text Annotation

Chris Biemann, Kalina Bontcheva, Richard Eckart de Castilho, Iryna Gurevych, Seid Muhie Yimam
In: The Handbook of Linguistic Annotation, 2017
Springer Netherlands

Automatic Analysis of Flaws in Pre-Trained NLP Models

Richard Eckart de Castilho
In: Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI3nOIAF2) at COLING 2016, p. 19--27, December 2016

A Web-based Tool for the Integrated Annotation of Semantic and Syntactic Structures

Richard Eckart de Castilho, Éva Mújdricza-Maydt, Seid Muhie Yimam, Silvana Hartmann, Iryna Gurevych, Anette Frank, Chris Biemann
In: Proceedings of the workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH) at COLING 2016, p. 76--84, December 2016

Text mining resources for the life sciences

Piotr Przybyła, Matthew Shardlow, Sophie Aubin, Robert Bossy, Richard Eckart de Castilho, Stelios Piperidis, John McNaught, Sophia Ananiadou
In: Database, Vol. 2016, p. 1--30, November 2016

Sense-annotating a lexical substitution data set with Ubyline

Tristan Miller, Mohamed Khemakhem, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), p. 828--835, May 2016
European Language Resources Association (ELRA)

Interoperability = f(community, division of labour)

Richard Eckart de Castilho
In: Proceedings of the Workshop on Cross-Platform Text Mining and Natural Language Processing Interoperability collocated with LREC 2016, p. 24--28, May 2016

In-tool Learning for Selective Manual Annotation in Large Corpora

Erik-Lân Do Dinh, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of ACL-IJCNLP 2015 System Demonstrations, p. 13--18, July 2015
Association for Computational Linguistics and The Asian Federation of Natural Language Processing

Analyzing Formulaic Patterns in Historical Corpora

Claudine Moulin, Iryna Gurevych, Natalia Filatkina, Richard Eckart de Castilho
In: Historical Corpora. Challenges and Perspectives, p. 51--64, 2015
Narr Publishing House
