Secure Documents using Individual Markers

Sichere Dokumente durch individuelle Markierungen (SIDIM)


Electronic documents can be copied and distributed via the web, beyond the control of the copyright holders. Especially when documents are of commercial value, a copyright holder might avoid modern distribution platforms for fear of illegal distribution as PDF or HTML via web pages, forums or P2P networks. A prominent example is the eBook. One solution to this problem is to individualise documents through visible and invisible marks that make copies distinguishable. This approach, also referred to as watermarking, makes it possible to trace a copy back to the user responsible for a copyright infringement and can thus effectively protect copyright holders and prevent illegal distribution. SiDiM focuses on the research and development of innovative and efficient individualisation mechanisms based on natural language watermarks.


The primary goal of the project is to develop novel methods that individualise electronic documents by manipulating their textual content in ways a reader cannot recognise. The marks should be difficult to remove and, at the same time, have no recognisable effect on the meaning of the content. The solution will be embedded in an electronic document distribution environment and remain transparent to the end user. To this end, interfaces and parsers for accessing the content of electronic documents, as well as new export functions and file formats, have to be created. To achieve sufficient quality at the textual level, new approaches to text analysis and paraphrasing will be investigated. Different methods will be evaluated for how effectively they can be applied to different document types. Within the project, UKP Lab will evaluate existing text watermarking approaches and develop new approaches for paraphrasing texts. In particular, UKP will also investigate the potential of using Wikipedia revision histories for paraphrasing.
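
To make the idea of a natural language watermark concrete, the following is a minimal sketch of one well-known family of techniques, synonym substitution. It is an illustration only and not the method developed in SiDiM: the synonym table, the function names `embed` and `extract`, and the convention that the first word of a pair encodes bit 0 are all assumptions made for this example. A realistic system would also need word sense disambiguation and context checks so that a substitution does not change the meaning.

```python
import re

# Hypothetical synonym pairs (assumption for this sketch):
# the first word of a pair encodes bit 0, the second encodes bit 1.
SYNONYM_PAIRS = [
    ("big", "large"),
    ("quick", "fast"),
    ("begin", "start"),
]

# Map every known word to its pair and to the bit it encodes.
LOOKUP = {
    word: (pair, bit)
    for pair in SYNONYM_PAIRS
    for bit, word in enumerate(pair)
}

# Split text into alternating word and non-word tokens so that
# joining the tokens reproduces spacing and punctuation exactly.
TOKEN = re.compile(r"\w+|\W+")


def embed(text, bits):
    """Rewrite substitutable words so that they encode the given bits in order."""
    remaining = list(bits)
    out = []
    for token in TOKEN.findall(text):
        entry = LOOKUP.get(token)  # lowercase exact match only in this sketch
        if entry is not None and remaining:
            pair, _ = entry
            out.append(pair[remaining.pop(0)])
        else:
            out.append(token)
    if remaining:
        raise ValueError("text has too few substitutable words for the given bits")
    return "".join(out)


def extract(text, n_bits):
    """Read back the first n_bits bits from the substitutable words in the text."""
    bits = []
    for token in TOKEN.findall(text):
        entry = LOOKUP.get(token)
        if entry is not None and len(bits) < n_bits:
            bits.append(entry[1])
    return bits


original = "We begin with a big table and a quick summary."
marked = embed(original, [1, 0, 1])
print(marked)              # We start with a big table and a fast summary.
print(extract(marked, 3))  # [1, 0, 1]
```

Each distributed copy would receive a different bit string, so a leaked copy could in principle be matched to its recipient. The sketch ignores capitalisation, inflection and robustness against deliberate rewording, which are among the issues that make the problem hard in practice.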

System Architecture

Project Publications



Uncertainty Detection for Natural Language Watermarking

György Szarvas, Iryna Gurevych
In: Proceedings of the Sixth International Joint Conference on Natural Language Processing, p. 1188-1194, October 2013
Asian Federation of Natural Language Processing

Supervised All-Words Lexical Substitution using Delexicalized Features

György Szarvas, Chris Biemann, Iryna Gurevych
In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013), p. 1131-1141, June 2013
Association for Computational Linguistics

Using Distributional Similarity for Lexical Expansion in Knowledge-based Word Sense Disambiguation

Tristan Miller, Chris Biemann, Torsten Zesch, Iryna Gurevych
In: Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), p. 1781-1796, December 2012

Cross-Genre and Cross-Domain Detection of Semantic Uncertainty

György Szarvas, Veronika Vincze, Richárd Farkas, György Móra, Iryna Gurevych
In: Computational Linguistics, Vol. 38, p. 335-367, June 2012

