DKPro Core

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. The provided components wrap a constantly growing set of state-of-the-art NLP tools and also include several original components. Together they cover a wide range of tasks: tokenization/segmentation, compound splitting, stemming, part-of-speech tagging, lemmatization, constituency parsing, dependency parsing, named entity recognition, coreference resolution, language identification, spelling correction, grammar checking, and reading and writing various file and corpus formats.


DKPro Core relies heavily on uimaFIT and is meant to be used with Apache Maven. The main components are hosted on Maven Central, while distributable models are available from the public Maven repository at UKP Lab.

Documentation, source code, and further instructions regarding DKPro Core can be found on the GitHub Project Site.

The source code is provided under different licenses, depending on the DKPro Core component:

  • DKPro Core ASL components use the Apache License 2.0
  • DKPro Core GPL components use the GNU General Public License 3.0


Approaches to Automatic Text Structuring
Nicolai Erbs
September 2015.

Counting What Counts: Decompounding for Keyphrase Extraction
Nicolai Erbs and Pedro Bispo Santos and Torsten Zesch and Iryna Gurevych
In: Association for Computational Linguistics: Proceedings of the ACL 2015 Workshop on Novel Computational Approaches to Keyphrase Extraction, p. 10-17, July 2015.
Technische Universität Darmstadt, 2015.

Identifying Argumentative Discourse Structures in Persuasive Essays
Christian Stab and Iryna Gurevych
In: Alessandro Moschitti and Bo Pang and Walter Daelemans: Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), p. 46-56, Association for Computational Linguistics, October 2014.

A broad-coverage collection of portable NLP components for building shareable analysis pipelines
Richard Eckart de Castilho and Iryna Gurevych
In: Nancy Ide and Jens Grivolla: Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT (OIAF4HLT) at COLING 2014, p. 1-11, Association for Computational Linguistics and Dublin City University, August 2014.

Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia
Johannes Daxenberger and Iryna Gurevych
In: Kristina Toutanova and Hua Wu: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 187-192, Association for Computational Linguistics, June 2014.

Natural Language Processing: Integration of Automatic and Manual Analysis
Richard Eckart de Castilho
February 2014.

Hierarchy Identification for Automatically Generating Table-of-Contents
Nicolai Erbs and Iryna Gurevych and Torsten Zesch
In: Galia Angelova and Kalina Bontcheva and Ruslan Mitkov: Proceedings of 9th Conference on Recent Advances in Natural Language Processing (RANLP 2013), p. 252-260, INCOMA Ltd., September 2013. ISSN 1313-8502.

