Darmstadt Knowledge Processing Repository

At the UKP Lab, we place a strong emphasis on developing the software behind our experiments in a reusable manner. We call the resulting body of software the Darmstadt Knowledge Processing Software Repository (DKPro).

Products

Several products have grown from our DKPro philosophy and have been released under an open source license to the public:

  • CSniper is a search-based annotation tool that helps distributed annotation teams find infrequent linguistic phenomena in large corpora.
  • DKPro Core provides a set of ready-to-use software components for natural language processing, based on the Apache UIMA framework.
  • DKPro Lab is a lightweight framework for parameter sweeping experiments. It allows you to set up experiments consisting of multiple interdependent tasks in a declarative manner with minimal overhead.
  • DKPro LSR (Lexical Semantic Resources) is a unified API for several lexical-semantic resources.
  • DKPro Similarity is an open source software package for developing text similarity algorithms.
  • DKPro Spelling includes components for real-word spelling error correction and experimental frameworks for mining such errors from the Wikipedia revision history as well as for the "Helping Our Own" shared tasks 2011 and 2012.
  • DKPro Statistics is a collection of open-licensed statistical tools, currently including correlation and inter-rater agreement methods.
  • DKPro TC (Text Classification) is a UIMA-based text classification framework built on top of DKPro Core, DKPro Lab and the Weka Machine Learning Toolkit. It is intended to facilitate supervised machine learning experiments with any kind of textual data.
  • DKPro Uby is a Java framework for creating and accessing sense-linked lexical resources in accordance with the UBY-LMF lexicon model, an instantiation of the ISO standard Lexicon Markup Framework (LMF).
  • DKPro WSD is a modular, extensible Java framework for word sense disambiguation.
  • JOWKL (Java OmegaWiki Library) is an open-source, Java-based application programming interface that provides access to all information contained in OmegaWiki, such as glosses, usage examples, translations and much more.
  • JWKTL (Java Wiktionary Library) is a free, Java-based application programming interface that provides access to the information contained in Wiktionary.
  • JWPL (Java Wikipedia Library) is a free, Java-based application programming interface that provides access to all information contained in Wikipedia.
  • WebAnno is a general purpose web-based annotation tool for a wide range of linguistic annotations.
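
To make the inter-rater agreement methods mentioned for DKPro Statistics more concrete, the following is a minimal stand-alone sketch of Cohen's kappa for two raters. It deliberately does not use the DKPro Statistics API; the class name CohenKappa, the method kappa and the sample labels are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative, stand-alone Cohen's kappa. This is NOT the DKPro Statistics API. */
class CohenKappa {

    /**
     * Computes Cohen's kappa for two raters who labeled the same items.
     * Returns NaN when chance agreement is 1, i.e. when both raters used
     * one and the same label for every item.
     */
    static double kappa(String[] raterA, String[] raterB) {
        if (raterA.length != raterB.length || raterA.length == 0) {
            throw new IllegalArgumentException(
                    "Both raters must label the same non-empty set of items");
        }
        int n = raterA.length;
        Map<String, Integer> countsA = new HashMap<>();
        Map<String, Integer> countsB = new HashMap<>();
        int agreements = 0;
        for (int i = 0; i < n; i++) {
            // Per-rater label frequencies are needed for the chance-agreement term.
            countsA.merge(raterA[i], 1, Integer::sum);
            countsB.merge(raterB[i], 1, Integer::sum);
            if (raterA[i].equals(raterB[i])) {
                agreements++;
            }
        }
        // Observed agreement: fraction of items with identical labels.
        double observed = (double) agreements / n;
        // Expected (chance) agreement: sum over labels of p_A(label) * p_B(label).
        double expected = 0.0;
        for (Map.Entry<String, Integer> e : countsA.entrySet()) {
            int inB = countsB.getOrDefault(e.getKey(), 0);
            expected += ((double) e.getValue() / n) * ((double) inB / n);
        }
        if (expected == 1.0) {
            return Double.NaN;
        }
        return (observed - expected) / (1.0 - expected);
    }

    public static void main(String[] args) {
        // Hypothetical annotations of six items by two raters.
        String[] a = {"pos", "pos", "neg", "neg", "pos", "neg"};
        String[] b = {"pos", "neg", "neg", "neg", "pos", "pos"};
        System.out.println("kappa = " + kappa(a, b));
    }
}
```

In the sample data, the raters agree on four of six items and each uses both labels equally often, so the observed agreement is 2/3, the chance agreement is 1/2, and kappa is 1/3. A library implementation would typically also cover more than two raters, missing values and weighted categories, which this sketch leaves out.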

Team

The principal investigator is Prof. Dr. Iryna Gurevych.

Richard Eckart de Castilho is currently the technical lead.

DKPro is a shared project of the entire UKP Lab, to which all group members contribute.

Teaching

We use DKPro products in our courses.

Awards

The UKP group received two of IBM's 2008 Unstructured Information Analytics (UIA) Awards for its DKPro proposals. The awards were covered in the 30 June 2008 issue of the Darmstädter Echo.

Publications

Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs

Richard Eckart de Castilho, Nancy Ide, Emanuele Lapponi, Stephan Oepen, Keith Suderman, Erik Velldal, Marc Verhagen
In: Proceedings of the 11th Linguistic Annotation Workshop (LAW XI) at EACL 2017, p. 67-75, April 2017
Association for Computational Linguistics
[Inproceedings]

Automatic Analysis of Flaws in Pre-Trained NLP Models

Richard Eckart de Castilho
In: Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI3nOIAF2) at COLING 2016, p. 19-27, December 2016
[Online-Edition: https://github.com/UKPLab/coling2016-modelinspector]
[Inproceedings]

Approaches to Automatic Text Structuring

Nicolai Erbs
September 2015
[Online-Edition: http://tuprints.ulb.tu-darmstadt.de/4959/]
[Phdthesis]

Counting What Counts: Decompounding for Keyphrase Extraction

Nicolai Erbs, Pedro Bispo Santos, Torsten Zesch, Iryna Gurevych
In: Proceedings of the ACL 2015 Workshop on Novel Computational Approaches to Keyphrase Extraction, p. 10-17, July 2015
Association for Computational Linguistics
[Online-Edition: Technische Universität Darmstadt, 2015]
[Inproceedings]

Identifying Argumentative Discourse Structures in Persuasive Essays

Christian Stab, Iryna Gurevych
In: Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), p. 46-56, October 2014
Association for Computational Linguistics
[Online-Edition: www.ukp.tu-darmstadt.de/data/argumentation-mining/argument-annotated-essays/]
[Inproceedings]

A broad-coverage collection of portable NLP components for building shareable analysis pipelines

Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT (OIAF4HLT) at COLING 2014, p. 1-11, August 2014
Association for Computational Linguistics and Dublin City University
[Online-Edition: https://dkpro.github.io/dkpro-core/]
[Inproceedings]

Sense and Similarity: A Study of Sense-level Similarity Measures

Nicolai Erbs, Iryna Gurevych, Torsten Zesch
In: Proceedings of the 3rd Joint Conference on Lexical and Computational Semantics (*SEM 2014), p. 30-39, August 2014
Association for Computational Linguistics and Dublin City University
[Inproceedings]

DKPro Agreement: An Open-Source Java Library for Measuring Inter-Rater Agreement

Christian M. Meyer, Margot Mieskes, Christian Stab, Iryna Gurevych
In: Proceedings of the 25th International Conference on Computational Linguistics: System Demonstrations (COLING), p. 105-109, August 2014
Dublin City University and Association for Computational Linguistics
[Online-Edition: https://dkpro.github.io/dkpro-statistics/]
[Inproceedings]

Automatic Annotation Suggestions and Custom Annotation Layers in WebAnno

Seid Muhie Yimam, Richard Eckart de Castilho, Iryna Gurevych, Chris Biemann
In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. System Demonstrations, p. 91-96, June 2014
Association for Computational Linguistics
[Online-Edition: https://webanno.github.io/webanno/]
[Inproceedings]

DKPro TC: A Java-based Framework for Supervised Learning Experiments on Textual Data

Johannes Daxenberger, Oliver Ferschke, Iryna Gurevych, Torsten Zesch
In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, p. 61-66, June 2014
Association for Computational Linguistics
[Online-Edition: https://github.com/dkpro/dkpro-tc]
[Inproceedings]