While a significant amount of knowledge is already available in structured form, in databases or as part of the Semantic Web, most knowledge is still recorded in unstructured form as natural language artifacts such as text documents and audio or video recordings. The Unstructured Information Management Architecture (UIMA) framework, originally developed by IBM, offers a platform for imposing structure on unstructured data and thus facilitates the extraction of knowledge from unstructured sources.
This project addresses changing topics from the areas of natural language processing, information extraction, information retrieval, and semantic knowledge processing.
The Darmstadt Knowledge Processing Software Repository (DKPro), provided by UKP, offers a set of ready-to-use Java libraries for analysis and indexing. The project will be implemented on top of the Apache UIMA framework.
Introductory sessions will be held on Thursdays during the first three weeks (14.04., 21.04., 28.04.) from 15:30 to 18:30 in S2|02 D017. Each session consists of a lecture part followed by an exercise part.
Brief regular status meetings are (tentatively) planned for Thursdays between 15:30 and 18:30. Actual times may vary depending on the number of participants.