DKPro Core

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. The components wrap a constantly growing set of state-of-the-art NLP tools and also include several original components. Together they cover a wide range of tasks: tokenization/segmentation, compound splitting, stemming, part-of-speech tagging, lemmatization, constituency parsing, dependency parsing, named entity recognition, coreference resolution, language identification, spelling correction, and grammar checking, as well as reading and writing various file and corpus formats.


DKPro Core relies heavily on uimaFIT and is meant to be used with Apache Maven. The main components are hosted on Maven Central, while distributable models are available from the public Maven repository at UKP Lab.

Documentation, source code, and further instructions for DKPro Core can be found on the GitHub project site.

The source code is provided under different licenses, depending on the DKPro Core component:

  • DKPro Core ASL components use the Apache License 2.0
  • DKPro Core GPL components use the GNU General Public License 3.0



Distantly Supervised POS Tagging of Low-Resource Languages under Extreme Data Sparsity: The Case of Hittite

Maria Sukhareva, Francesco Fuscagni, Johannes Daxenberger, Susanne Görke, Doris Prechel, Iryna Gurevych
In: LaTeCH '17 Proceedings of the 11th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, to appear, August 2017

Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs

Richard Eckart de Castilho, Nancy Ide, Emanuele Lapponi, Stephan Oepen, Keith Suderman, Erik Velldal, Marc Verhagen
In: Proceedings of the 11th Linguistic Annotation Workshop (LAW XI) at EACL 2017, to appear, April 2017

Automatic Analysis of Flaws in Pre-Trained NLP Models

Richard Eckart de Castilho
In: Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI3nOIAF2) at COLING 2016, p. 19-27, December 2016

Interoperability = f(community, division of labour)

Richard Eckart de Castilho
In: Proceedings of the Workshop on Cross-Platform Text Mining and Natural Language Processing Interoperability collocated with LREC 2016, p. 24-28, May 2016

A Tool for NLP-Preprocessing in Literary Text Analysis

Nils Reimers, Fotis Jannidis, Steffen Pielström, Stefan Pernes, Isabella Reger
March 2016

DARIAH-DKPro-Wrapper Output Format (DOF) Specification

Fotis Jannidis, Stefan Pernes, Steffen Pielström, Isabella Reger, Nils Reimers, Thorsten Vitt

Approaches to Automatic Text Structuring

Nicolai Erbs
September 2015

Counting What Counts: Decompounding for Keyphrase Extraction

Nicolai Erbs, Pedro Bispo Santos, Torsten Zesch, Iryna Gurevych
In: Proceedings of the ACL 2015 Workshop on Novel Computational Approaches to Keyphrase Extraction, p. 10-17, July 2015
Association for Computational Linguistics
[Online-Edition: Technische Universität Darmstadt, 2015]

Identifying Argumentative Discourse Structures in Persuasive Essays

Christian Stab, Iryna Gurevych
In: Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), p. 46-56, October 2014
Association for Computational Linguistics

A broad-coverage collection of portable NLP components for building shareable analysis pipelines

Richard Eckart de Castilho, Iryna Gurevych
In: Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT (OIAF4HLT) at COLING 2014, p. 1-11, August 2014
Association for Computational Linguistics and Dublin City University