DKPro Text Classification

DKPro TC (Text Classification) is a UIMA-based text classification framework built on top of DKPro Core, DKPro Lab, and several machine learning frameworks (e.g. the Weka Machine Learning Toolkit). It is intended to facilitate supervised machine learning experiments with any kind of textual data.

DKPro TC comes with:

  • getting-started example code in Java and Groovy for standard text collections, e.g. the Reuters-21578 Text Categorization corpus
  • many generic feature extractors, e.g. for n-grams and POS tags
  • convenient parameter optimization capabilities
  • comprehensive reporting with support for many standard performance measures
  • support for single-label and multi-label classification as well as pair-wise document classification
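
To give a rough idea of what an n-gram feature extractor computes, here is a minimal sketch in plain Java. It does not use the DKPro TC API: the class name `NGramSketch`, the method `extract`, and the choice of `_` as a token separator are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of DKPro TC.
class NGramSketch {

    // Counts every contiguous run of n tokens.
    // Tokens within an n-gram are joined with "_" (an arbitrary choice).
    static Map<String, Integer> extract(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // LinkedHashMap keeps n-grams in order of first appearance.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            String gram = String.join("_", tokens.subList(i, i + n));
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("to", "be", "or", "not", "to", "be");
        // Prints {to_be=2, be_or=1, or_not=1, not_to=1}
        System.out.println(extract(tokens, 2));
    }
}
```

In a classification experiment, counts like these would typically become the feature values for one document; a real feature extractor in the framework operates on UIMA annotations rather than on a plain token list.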

Downloads

Major parts of the source code are provided under the Apache Software License (ASL) version 2.
