Personal Information

Dr. Richard Eckart de Castilho
(last name: Eckart de Castilho, first name: Richard)
Technical Lead
+49 (6151) 16 - 25299
+49 (6151) 16 - 25295
S2|02 B117
TU Darmstadt - FB 20
Hochschulstraße 10
64289 Darmstadt



Biographical Information

I earned my Diplom in Computer Science at the TU Darmstadt in 2006. From then until September 2008, I worked for the Linguistische Profile interdisziplinärer Register (LingPro) DFG project at the TU Darmstadt. I gathered more experience with the UIMA framework during a three-month stay at the IBM Watson Research Center in Hawthorne, New York.

Current projects

Darmstadt Knowledge Processing Repository (DKPro)

I currently work as technical lead on the Darmstadt Knowledge Processing Software Repository project (DKPro). DKPro covers many projects; those I work on most intensively are listed separately below. My main responsibilities are development process optimization, release management, documentation, and dissemination activities.

DKPro Core

DKPro Core is a collection of interoperable NLP components built on the Apache UIMA framework. The project aims to integrate many proven tools and resources from the NLP community into a common processing framework and to provide a common abstraction layer. Through the use of Apache UIMA, uimaFIT, and Apache Maven, DKPro Core offers convenient access to NLP components and simplifies the implementation of NLP pipelines. I am one of the main developers of DKPro Core.

DKPro Lab

DKPro Lab is a lightweight framework for parameter sweeping experiments. It allows experiments consisting of multiple interdependent tasks to be set up in a declarative manner with minimal overhead. Data produced by a task for any particular parameter configuration is stored and re-used whenever possible to avoid needless recalculation of results. Reports can be attached to each task to post-process the experimental results and present them in a convenient manner, e.g. as tables or charts. I am the main developer.
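
The idea of storing and re-using results per parameter configuration can be illustrated with a minimal sketch. It is written in plain Python and does not use the DKPro Lab API; the names sweep, train, space and cache, as well as the scoring formula, are hypothetical and chosen only for illustration.

```python
import itertools


def sweep(parameter_space, task, cache):
    """Run `task` once per parameter configuration, re-using cached results."""
    names = sorted(parameter_space)
    results = {}
    for values in itertools.product(*(parameter_space[n] for n in names)):
        # A tuple of (name, value) pairs is hashable and identifies one configuration.
        config = tuple(zip(names, values))
        if config not in cache:
            # The task is executed only the first time a configuration is seen.
            cache[config] = task(dict(config))
        results[config] = cache[config]
    return results


calls = []


def train(config):
    # Hypothetical task: records that it ran and returns a made-up score.
    calls.append(config)
    return config["ngram"] * config["threshold"]


space = {"ngram": [1, 2, 3], "threshold": [0.5, 1.0]}
cache = {}
first = sweep(space, train, cache)
second = sweep(space, train, cache)  # served entirely from the cache
```

In this sketch the cache is an in-memory dictionary; a persistent store keyed by the same configuration would let results survive across separate runs.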

Apache uimaFIT™

Apache uimaFIT is an open source library that provides factories, injection, and testing utilities for Apache UIMA™. uimaFIT is a core technology at UKP because it greatly simplifies the development of UIMA components and thus allows us to concentrate on implementing actual NLP functionality. I am a committer on the Apache UIMA project, working primarily on uimaFIT.

TT4J - TreeTagger for Java

TreeTagger for Java (TT4J) is an open source library which provides a Java API to Helmut Schmid's TreeTagger. TT4J is used by the DKPro TreeTagger UIMA component. I am the main developer.

JWPL - Java-based Wikipedia Library

JWPL (Java Wikipedia Library) is a free, Java-based application programming interface that provides access to all information contained in Wikipedia. I am a consulting developer.


A Legal Perspective on Training Models for Natural Language Processing

Richard Eckart de Castilho, Giulia Dore, Penny Labropoulou, Tom Margoni, Iryna Gurevych
In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), to appear, May 2018
European Language Resources Association (ELRA)

Collaborative Web-based Tools for Multi-layer Text Annotation

Chris Biemann, Kalina Bontcheva, Richard Eckart de Castilho, Iryna Gurevych, Seid Muhie Yimam
In: The Handbook of Linguistic Annotation, p. 229--256, May 2017
Springer Netherlands

A tool for extracting sense-disambiguated example sentences through user feedback

Beto Boullosa, Richard Eckart de Castilho, Alexander Geyken, Lothar Lemnitzer, Iryna Gurevych
In: Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, p. 69--72, April 2017
Association for Computational Linguistics

Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs

Richard Eckart de Castilho, Nancy Ide, Emanuele Lapponi, Stephan Oepen, Keith Suderman, Erik Velldal, Marc Verhagen
In: Proceedings of the 11th Linguistics Annotation Workshop (LAW XI) at EACL 2017, p. 67--75, April 2017
Association for Computational Linguistics

Automatic Analysis of Flaws in Pre-Trained NLP Models

Richard Eckart de Castilho
In: Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI3nOIAF2) at COLING 2016, p. 19--27, December 2016

A Web-based Tool for the Integrated Annotation of Semantic and Syntactic Structures

Richard Eckart de Castilho, Éva Mújdricza-Maydt, Seid Muhie Yimam, Silvana Hartmann, Iryna Gurevych, Anette Frank, Chris Biemann
In: Proceedings of the workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH) at COLING 2016, p. 76--84, December 2016

Text mining resources for the life sciences

Piotr Przybyła, Matthew Shardlow, Sophie Aubin, Robert Bossy, Richard Eckart de Castilho, Stelios Piperidis, John McNaught, Sophia Ananiadou
In: Database, Vol. 2016, p. 1--30, November 2016

Sense-annotating a lexical substitution data set with Ubyline

Tristan Miller, Mohamed Khemakhem, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), p. 828--835, May 2016
European Language Resources Association (ELRA)

Interoperability = f(community, division of labour)

Richard Eckart de Castilho
In: Proceedings of the Workshop on Cross-Platform Text Mining and Natural Language Processing Interoperability collocated with LREC 2016, p. 24--28, May 2016

In-tool Learning for Selective Manual Annotation in Large Corpora

Erik-Lân Do Dinh, Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of ACL-IJCNLP 2015 System Demonstrations, p. 13--18, July 2015
Association for Computational Linguistics and The Asian Federation of Natural Language Processing
