DKPro Text Classification

DKPro TC (Text Classification) is a UIMA-based text classification framework built on top of DKPro Core, DKPro Lab, and several machine learning frameworks (e.g., the Weka Machine Learning Toolkit). It is intended to facilitate supervised machine learning experiments with any kind of textual data.

DKPro TC comes with

  • getting-started example code in Java and Groovy for standard text collections, e.g. the Reuters-21578 Text Categorization corpus
  • many generic feature extractors, e.g. n-grams and POS tags
  • convenient parameter optimization capabilities
  • comprehensive reporting with support for many standard performance measures
  • support for single- and multi-label classification as well as pair-wise document classification
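
To give a rough idea of what a generic n-gram feature extractor computes, the following is a minimal, framework-independent sketch in plain Java. It is not DKPro TC code and does not use the DKPro TC API: the class name NGramSketch, the method extract, the whitespace tokenization, and the underscore used to join tokens are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; NOT part of the DKPro TC API.
class NGramSketch {

    // Counts all word n-grams with lengths from minN to maxN (inclusive)
    // in a lower-cased, whitespace-tokenized text. Each n-gram becomes one
    // feature name, and its count becomes the feature value.
    static Map<String, Integer> extract(String text, int minN, int maxN) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return counts; // no tokens, so no features
        }
        String[] tokens = normalized.split("\\s+");
        for (int n = minN; n <= maxN; n++) {
            for (int i = 0; i + n <= tokens.length; i++) {
                // Join the n consecutive tokens starting at position i.
                String gram = String.join("_", Arrays.copyOfRange(tokens, i, i + n));
                counts.merge(gram, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints unigram and bigram counts for a toy sentence.
        System.out.println(extract("the cat sat on the mat", 1, 2));
    }
}
```

In a real experiment, feature maps of this kind would be computed per document and passed to a machine learning backend; the framework's own extractors operate on UIMA annotations rather than on raw strings as assumed here.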


Major parts of the source code are provided under the Apache Software License (ASL) version 2.




What is the Essence of a Claim? Cross-Domain Claim Identification

Johannes Daxenberger, Steffen Eger, Ivan Habernal, Christian Stab, Iryna Gurevych
April 2017

Domain-Specific Aspects of Scientific Reasoning and Argumentation: Insights from Automatic Coding

Johannes Daxenberger, Andras Csanadi, Christian Ghanem, Ingo Kollar, Iryna Gurevych
In: Scientific Reasoning and Argumentation: Domain-Specific and Domain-General Aspects, to appear, 2017
Taylor & Francis

Turbulent Stability of Emergent Roles: The Dualistic Nature of Self-Organizing Knowledge Co-Production

Ofer Arazy, Johannes Daxenberger, Hila Lifshitz-Assaf, Oded Nov, Iryna Gurevych
In: Information Systems Research, Vol. 27, p. 792-812, December 2016

Automated Text Classification to Capture Scientific Reasoning and Argumentation Processes in Different Professional Problem Solving Contexts

Andras Csanadi, Johannes Daxenberger, Christian Ghanem, Ingo Kollar, Frank Fischer, Iryna Gurevych
July 2016

UKPDIPF: A Lexical Semantic Approach to Sentiment Polarity Prediction in Twitter Data

Lucie Flekova, Oliver Ferschke, Iryna Gurevych
In: SemEval-2014 Task 9: Sentiment Analysis in Twitter. Proceedings of the 8th International Workshop on Semantic Evaluation, p. 704-710, August 2014
Association for Computational Linguistics and Dublin City University

DKPro TC: A Java-based Framework for Supervised Learning Experiments on Textual Data

Johannes Daxenberger, Oliver Ferschke, Iryna Gurevych, Torsten Zesch
In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, p. 61-66, June 2014
Association for Computational Linguistics

Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

Johannes Daxenberger, Iryna Gurevych
In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 187-192, June 2014
Association for Computational Linguistics