Sichere Dokumente durch individuelle Markierungen (SiDiM)
Electronic documents can be copied and distributed via the web in ways that copyright holders cannot control. Especially when documents have commercial value, copyright holders may avoid modern distribution platforms for fear of illegal distribution as PDF or HTML via web pages, forums, or P2P networks. A prominent example is the eBook. One solution to this problem is to individualize documents through visible and invisible marks that make copies distinguishable. This approach, also referred to as watermarking, makes it possible to trace a copy back to the user responsible for a copyright infringement, and can thus protect copyright holders and deter illegal distribution. SiDiM focuses on the research and development of innovative and efficient individualization mechanisms based on natural language watermarks.
The primary goal of the project is to develop novel methods that individualize electronic documents by manipulating their textual content in ways a reader cannot recognize. The marks should be difficult to remove and, at the same time, have no recognizable effect on the meaning of the content. The solution will be embedded in an electronic document distribution environment and remain transparent to the end user. This requires interfaces and parsers for accessing the content of electronic documents, as well as new export functions and file formats. To achieve sufficient quality at the textual level, new approaches to text analysis and paraphrasing will be investigated. Different methods will be evaluated for their effectiveness on different document types. Within the project, UKP Lab will evaluate existing text watermarking approaches and develop new approaches for paraphrasing texts. In particular, UKP will investigate the potential of Wikipedia revision histories as a resource for paraphrasing.
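To illustrate the general idea of individualizing a document through its textual content, here is a minimal sketch of one well-known family of natural language watermarks: synonym substitution. It is not the SiDiM method, which is not described here. The synonym table and the function names `embed` and `extract` are hypothetical, and the whitespace tokenization is a deliberate simplification.

```python
# Toy synonym-substitution watermark (illustrative assumption, not the SiDiM method).
# Each hypothetical synonym pair carries one bit: first word = 0, second word = 1.
SYNONYM_PAIRS = [
    ("big", "large"),
    ("quick", "fast"),
    ("begin", "start"),
    ("buy", "purchase"),
]

# Map every variant to (index of its pair, bit it encodes).
LOOKUP = {
    word: (i, bit)
    for i, pair in enumerate(SYNONYM_PAIRS)
    for bit, word in enumerate(pair)
}


def embed(text: str, bits: list[int]) -> str:
    """Rewrite carrier words so that, read in order, they spell out `bits`."""
    remaining = list(bits)
    out = []
    for token in text.split():
        key = token.lower()
        # Only exact whitespace-delimited matches count as carriers;
        # words with attached punctuation are left untouched.
        if key in LOOKUP and remaining:
            pair_index, _ = LOOKUP[key]
            # The replacement is lowercase, so original capitalization is lost.
            token = SYNONYM_PAIRS[pair_index][remaining.pop(0)]
        out.append(token)
    if remaining:
        raise ValueError("text has too few carrier words for the payload")
    return " ".join(out)


def extract(text: str, n_bits: int) -> list[int]:
    """Read the first `n_bits` carrier words back into a bit sequence."""
    bits = []
    for token in text.split():
        key = token.lower()
        if key in LOOKUP and len(bits) < n_bits:
            bits.append(LOOKUP[key][1])
    return bits


original = "we begin with a big order and a quick delivery after you buy"
user_id_bits = [1, 0, 1, 1]  # hypothetical per-customer identifier
marked = embed(original, user_id_bits)
recovered = extract(marked, len(user_id_bits))
```

Each customer would receive a copy whose word choices encode a different bit sequence, so a leaked copy can be matched to its recipient. A scheme this simple is easy to remove and ignores context, which is why the quality of text analysis and paraphrasing matters for marks that should be robust and meaning-preserving.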