In times of increased availability of text resources and computational power, it becomes possible to identify structure in language data without manual annotation. Be it in the context of search engines or machine translation, models that are learned from data alone can significantly improve performance on natural language tasks by folding in background knowledge of language that has been acquired by only looking at (a large amount of) text.
For this, we have to review concepts of language statistics, language modeling and clustering, as they form the backbone of many works in this area.
Selected topics include unsupervised part-of-speech tagging, unsupervised morphology, word sense induction, acquisition of semantic relations, topic models, unsupervised parsing, and latent semantic analysis.
The seminar provides detailed coverage of current techniques, their strengths and limitations, and current research directions by including recent research papers. In the course of the seminar, students will acquire key skills such as the fundamentals of academic research and scientific writing, and they will be encouraged to improve their presentation skills.
This seminar is held in the format of a Mini-Workshop: After an introductory lecture, individual topics are assigned, and introductory literature is provided for each topic. Students write a paper consisting of a literature overview and a description of their own experiment. The papers are then peer-reviewed by the other participants. In a final workshop, each student presents their work in a 20-minute presentation.
Literature is distributed by topic.
Each student is expected to
The course management system is used as the primary communication platform for the seminar and also contains any related material. The access key will be provided in the first seminar session.
For general advice on presenting your topic, please have a look at these guidelines.
The seminar takes place Tuesdays, 15:20 - 17:00, S2|02 C120.
The papers and presentation slides of all participants who have given permission to publish their materials are listed below.