Wikulu - Self-Organizing Wikis

Project Goals

The importance of web-based collaboration systems called wikis, such as Wikipedia and corporate wikis, has grown tremendously in recent years. Because a new wiki is very easy to use, its content grows quickly. A common drawback of wikis, however, is that usability decreases as the amount of content increases. The Wikulu - Self-Organizing Wikis project at the UKP Lab employs the latest Natural Language Processing (NLP) technologies to manage unstructured information, i.e. to structure the content in corporate wikis. The objective of the project is to implement intelligent approaches that assist the user while creating, editing, or searching content. Wikulu should relieve the user of manual information management, leaving more room for productive work. Why is it called Wikulu? Kukulu is Hawaiian for "to organize"!

Project Publications

Hierarchy Identification for Automatically Generating Table-of-Contents

Author Nicolai Erbs, Iryna Gurevych, Torsten Zesch
Date September 2013
Kind Inproceedings
Editor Galia Angelova, Kalina Bontcheva, Ruslan Mitkov
Publisher INCOMA Ltd.
Address Shoumen, Bulgaria
Book title Proceedings of 9th Conference on Recent Advances in Natural Language Processing (RANLP 2013)
Location Hissar, Bulgaria
Research Areas Ubiquitous Knowledge Processing, Knowledge Discovery in Scientific Literature, UKP_a_NLP4Wikis, UKP_p_WIWEB, UKP_p_WIKULU, reviewed, UKP_s_JWPL, UKP_s_DKPro_Lab, UKP_s_DKPro_Core, UKP_p_openwindow
Abstract A table-of-contents (TOC) provides a quick reference to a document’s content and structure. We present the first study on identifying the hierarchical structure for automatically generating a TOC using only textual features instead of structural hints, e.g. from HTML tags. We create two new datasets to evaluate our approaches for hierarchy identification. We find that our algorithm performs on a level that is sufficient for a fully automated system. For documents without given segment titles, we extend our work by automatically generating segment titles. We make the datasets and our experimental framework publicly available in order to foster future research in TOC generation.

Important Copyright Notice:

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


We are always looking for students who are interested in Wikulu and want to help us with our programming and research tasks. Please contact us if you want to know more!

Related Projects

The Wikulu project applies fundamental NLP technologies developed at the UKP Lab to real-life knowledge management problems. It builds upon several successful ongoing projects at the UKP Lab, such as:

  • WiWeb funded by the Förderinitiative Interdisziplinäre Forschung: Utilizing Web Knowledge: Language Technologies and Psychological Processes
  • SIR 1+2 funded by the German Research Foundation (DFG): Extracting structured lexical semantic knowledge from wiki-based web 2.0 sources such as Wikipedia and Wiktionary and integrating contextually-aware semantic relatedness into information retrieval and keyphrase extraction
  • DKPro funded by a UIMA 2007 Innovation Award and by two UIMA 2008 Innovation Awards from IBM: Integrating NLP components in a repository of semantic information management software based on IBM’s industrial-strength Unstructured Information Management Architecture (UIMA) framework


The Wikulu - Self-Organizing Wikis project is funded by the Klaus Tschira Foundation.

