Software
UKP releases software as a service to the academic community. We would like to know who is using our software so that we can contact users about updates, bug fixes, and the like. Below is a list of our current software packages. If you download any of them, please send an email to the assigned contact person. Thank you!
Darmstadt Knowledge Processing Repository
New: The UKP group receives IBM's UIMA Innovation Award 2007 for their DKPro Repository proposal! Read the news item (in German). The improved and extended DKPro Repository will soon be made available to the research community on this web site. Read more about the proposed improvements.
Numerous powerful and state-of-the-art NLP components are already freely available in the NLP research community. New and improved components are being developed and made available all the time. The components cover the whole range of NLP-related processing tasks, including tokenization, sentence splitting, POS-tagging, and lemmatization, but also more complex tasks like parsing, lexical chain detection, topic segmentation, and the like. However, most of the components are realized as stand-alone applications without a common and well-defined framework. This is problematic when it comes to integrating components into a processing pipeline for a particular task.
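To illustrate the integration problem, here is a minimal sketch of what a shared component contract might look like. It is not the DKPro or UIMA API: the `Component` interface, the two toy components, and the `run` helper are hypothetical names chosen for this example only, and real components would exchange far richer annotations than a list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PipelineSketch {

    /** Hypothetical common contract: each component maps a list of strings to a list of strings. */
    interface Component {
        List<String> process(List<String> input);
    }

    /** Toy tokenizer: splits every input string on whitespace. */
    static class WhitespaceTokenizer implements Component {
        public List<String> process(List<String> input) {
            List<String> tokens = new ArrayList<>();
            for (String text : input) {
                for (String token : text.trim().split("\\s+")) {
                    if (!token.isEmpty()) {
                        tokens.add(token);
                    }
                }
            }
            return tokens;
        }
    }

    /** Toy normalizer: lowercases every token. */
    static class Lowercaser implements Component {
        public List<String> process(List<String> input) {
            List<String> result = new ArrayList<>();
            for (String token : input) {
                result.add(token.toLowerCase(Locale.ROOT));
            }
            return result;
        }
    }

    /** Feeds the output of each component into the next one, in order. */
    static List<String> run(List<Component> pipeline, String text) {
        List<String> data = new ArrayList<>();
        data.add(text);
        for (Component component : pipeline) {
            data = component.process(data);
        }
        return data;
    }

    public static void main(String[] args) {
        List<Component> pipeline =
                Arrays.asList(new WhitespaceTokenizer(), new Lowercaser());
        System.out.println(run(pipeline, "Wikipedia is a Lexical Resource"));
        // prints [wikipedia, is, a, lexical, resource]
    }
}
```

Because both components implement the same interface, they can be reordered or replaced without touching the pipeline code. Stand-alone applications with their own input and output formats do not offer this.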
JWPL - Java-based Wikipedia API
Lately, Wikipedia has been recognized as a promising lexical semantic resource. If Wikipedia is to be used for large-scale NLP tasks, efficient programmatic access to the knowledge therein is required.
JWPL (Java Wikipedia Library) is a free, Java-based application programming interface that provides access to all information contained in Wikipedia.
JWKTL - Java Wiktionary Library
JWKTL (Java Wiktionary Library) is a free, Java-based application programming interface that provides access to the information contained in Wiktionary.
DEXTRACT - Creating datasets for evaluating semantic relatedness
DEXTRACT is a tool for semi-automatically creating word pair datasets. Such datasets are frequently used for evaluating semantic relatedness measures, but creating them manually is tedious work.





