Community-based Question Answering for Educational Information

Motivation

In the age of lifelong learning, the amount of educational data provided by expert services and community-based question-and-answer (QA) pages on the Web is growing fast. Although these pages provide useful information, users cannot always benefit from it easily. They have to visit various educational information services and query each of them individually, which takes considerable effort. They also have to judge which of the available web pages and services are reliable and provide high-quality information. Automatically analyzing this information will help users find what they need with minimal effort. To this end, we create an automatic question answering system that searches the available educational information sources to answer users' questions.

 

Goals

The basic goal of this project is to answer user questions on various educational topics, for instance “How can I do voluntary service abroad?” or “Where can I get information on studying Mathematics?”. Since a large portion of users' questions have already been asked by other people and answered by experts or the crowd, we use the available question and answer archives to answer them. The resulting system will have an interface that takes natural language questions, retrieves the requested information from heterogeneous information sources on the Web, and presents it to the user as a filtered, summarized, and quality-assessed answer. If the system cannot retrieve an answer, it could automatically post the question to the community-based QA sites used in the project so that the crowd can provide an answer later.

Another goal of the project is to investigate the impact of semantic information on community-based QA. A Semantic Role Labeling processor converts source texts into a shallow semantic representation, which provides a useful level of abstraction for other QA tasks. The system is developed for German and English, so that the principles and design decisions behind it can be transferred to other languages as well.

Methods

1 - Interface for natural language questions

Using this interface, users can post their questions and specify the educational information sources that should be used to answer them. In addition, they can define the criteria that should be used to assess the quality of the answers.

 

2 - Text processing and retrieval

This part of the project collects and analyzes all question-answer pairs available in FAQ collections and social QA forums. The analyzed data is then used during online search. When the user issues a new query, the system uses paraphrase recognition and information retrieval techniques to find similar questions that have already been asked and are available in the archive. Their answers are candidate answers to the newly posted question. In addition, the system uses information retrieval methods to search the answers directly. This answer retrieval step covers texts that answer the input question even though their corresponding questions are not similar to it. A novel graph-based technique is used for finding relevant passages in the source data.
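
The question retrieval step can be illustrated with a minimal sketch. It assumes plain TF-IDF weighting with cosine similarity as the information retrieval technique, which is a stand-in chosen for illustration and not necessarily what the project uses. It does not cover paraphrase recognition or the graph-based passage retrieval. The archive entries, their answer texts, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(documents):
    """Smoothed inverse document frequency for every token in the archive."""
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; tokens that never occur in the archive are ignored."""
    counts = Counter(tokenize(text))
    return {tok: c * idf[tok] for tok, c in counts.items() if tok in idf}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query, archive, top_k=3):
    """Rank archived (question, answer) pairs by question similarity to the query."""
    idf = build_idf([question for question, _ in archive])
    query_vec = tfidf_vector(query, idf)
    scored = [
        (cosine(query_vec, tfidf_vector(question, idf)), question, answer)
        for question, answer in archive
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical archive; the answer texts are placeholders, not real data.
ARCHIVE = [
    ("How can I do voluntary service abroad?",
     "Placeholder answer about voluntary service programmes."),
    ("Where can I get information on studying Mathematics?",
     "Placeholder answer about study advisory services."),
    ("How do I apply for a scholarship?",
     "Placeholder answer about scholarship applications."),
]

for score, question, answer in retrieve_similar(
        "Where do I find information about studying mathematics?", ARCHIVE, top_k=2):
    print(f"{score:.3f}  {question}  ->  {answer}")
```

The answers attached to the top-ranked archived questions are the candidate answers for the new query. A purely lexical measure like this one misses paraphrases that share few words, which is the gap paraphrase recognition is meant to close.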

 

3 - Quality assessment

The quality assessment of the system considers two different perspectives: text quality and reliability. The answers are re-ranked based on the results of the quality assessment and the quality criteria defined by the user.
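
As a rough sketch of such re-ranking, the example below combines per-answer scores with user-defined weights. The weighted average, the score fields, the value ranges, and all names are assumptions made for illustration; the page does not specify how the project computes or combines its quality scores.

```python
from dataclasses import dataclass

# Criteria a user may weight; the names are hypothetical.
CRITERIA = ("relevance", "text_quality", "reliability")


@dataclass
class Candidate:
    text: str
    relevance: float     # retrieval score, assumed to lie in [0, 1]
    text_quality: float  # assumed to lie in [0, 1]
    reliability: float   # assumed to lie in [0, 1]


def combined_score(candidate, weights):
    """Weighted average of the criteria the user assigned a weight to."""
    total = sum(weights.get(name, 0.0) for name in CRITERIA)
    if total <= 0:
        raise ValueError("at least one criterion needs a positive weight")
    weighted = sum(weights.get(name, 0.0) * getattr(candidate, name) for name in CRITERIA)
    return weighted / total


def rerank(candidates, weights):
    """Return the candidates sorted by descending combined score."""
    return sorted(candidates, key=lambda c: combined_score(c, weights), reverse=True)


# Invented example scores, not measured data.
candidates = [
    Candidate("answer A", relevance=0.9, text_quality=0.4, reliability=0.3),
    Candidate("answer B", relevance=0.7, text_quality=0.8, reliability=0.9),
]

by_relevance = rerank(candidates, {"relevance": 1.0})
by_quality = rerank(candidates, {"relevance": 1.0, "text_quality": 1.0, "reliability": 2.0})
print([c.text for c in by_relevance], [c.text for c in by_quality])
```

With relevance as the only criterion, answer A ranks first. Once the user also weights text quality and reliability, answer B overtakes it.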

 

4 - Answer summarization

The output of the system is a summarized set of answers to the user's question together with information on their source pages and their quality scores.

 

5 - Semantic Role Labeling

One of the project's objectives is to evaluate the impact of semantic information on community-based question answering. Semantic Role Labeling (SRL) is the task of automatically inferring shallow semantic interpretations that describe input texts in terms of events and their participants (e.g. who did what to whom). This information is then used to improve the quality of question categorization, answer retrieval, and summarization. Several theoretical frameworks for Semantic Role Labeling exist, and they differ in the granularity of their descriptions. The choice of granularity is a trade-off between semantic richness and processing quality, and it is important to investigate which representation suits the question answering task best. Adapting existing SRL systems to the non-standard language often used in Q&A poses an additional challenge.
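
To make the "who did what to whom" representation concrete, here is a small sketch of a predicate-argument frame as a data structure. The frames are written by hand and no SRL system is run. The role labels (Agent, Theme, Recipient) and the class design are illustrative assumptions and are not tied to any particular SRL framework.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Frame:
    """One event: a predicate plus its participants, keyed by role label."""
    predicate: str
    roles: Dict[str, str] = field(default_factory=dict)

    def filler(self, role: str) -> Optional[str]:
        """Return the participant filling the given role, or None if absent."""
        return self.roles.get(role)


# "The professor teaches the students mathematics."
active = Frame("teach", {"Agent": "the professor",
                         "Theme": "mathematics",
                         "Recipient": "the students"})

# "The students are taught mathematics by the professor."
passive = Frame("teach", {"Theme": "mathematics",
                          "Recipient": "the students",
                          "Agent": "the professor"})

# Both surface forms map to the same shallow semantic representation.
print(active == passive, active.filler("Agent"))
```

The two sentences differ in word order and voice but yield equal frames, which is the kind of abstraction that can help match a question to differently worded answers.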

 

Team

Former Staff:

  • Dr. Saeedeh Momtazi, Postdoctoral Researcher (associated, on leave)
  • Silvana Hartmann, Doctoral Researcher
  • Yan Shao, Project Staff

Publications

Real-Time News Summarization with Adaptation to Media Attention

Andreas Rücklé, Iryna Gurevych
In: Proceedings of the 11th Conference on Recent Advances in Natural Language Processing (RANLP 2017), p. (to appear), September 2017
[Inproceedings]

Representation Learning for Answer Selection with LSTM-Based Importance Weighting

Andreas Rücklé, Iryna Gurevych
In: Proceedings of the 12th International Conference on Computational Semantics (IWCS 2017), p. (to appear), September 2017
[Online-Edition: https://github.com/UKPLab/iwcs2017-answer-selection]
[Inproceedings]

Context-Aware Representations for Knowledge Base Relation Extraction

Daniil Sorokin, Iryna Gurevych
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), p. 1785-1790, September 2017
Association for Computational Linguistics
[Online-Edition: https://github.com/UKPLab/emnlp2017-relation-extraction]
[Inproceedings]

End-to-End Non-Factoid Question Answering with an Interactive Visualization of Neural Attention Weights

Andreas Rücklé, Iryna Gurevych
In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics-System Demonstrations (ACL 2017), p. 19-24, August 2017
Association for Computational Linguistics
[Online-Edition: https://github.com/UKPLab/acl2017-non-factoid-qa]
[Inproceedings]

Out-of-domain FrameNet Semantic Role Labeling

Silvana Hartmann, Ilia Kuznetsov, Teresa Martin, Iryna Gurevych
In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), p. 471-482, April 2017
Association for Computational Linguistics
[Inproceedings]

LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test

Michael Bugert, Yevgeniy Puzikov, Andreas Rücklé, Judith Eckle-Kohler, Teresa Martin, Eugenio Martínez Cámara, Daniil Sorokin, Maxime Peyrard, Iryna Gurevych
In: Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem, held in conjunction with EACL2017), p. 56-61, April 2017
Association for Computational Linguistics
[Online-Edition: https://github.com/UKPLab/lsdsem2017-story-cloze]
[Inproceedings]

Assessing SRL Frameworks with Automatic Training Data Expansion

Silvana Hartmann, Éva Mújdricza-Maydt, Ilia Kuznetsov, Iryna Gurevych, Anette Frank
In: Proceedings of the 11th Linguistic Annotation Workshop (LAW XI) at EACL 2017, p. 115-121, April 2017
Association for Computational Linguistics
[Inproceedings]

End-to-end Representation Learning for Question Answering with Weak Supervision

Daniil Sorokin, Iryna Gurevych
In: ESWC 2017 Semantic Web Challenges, p. (to appear), 2017
[Inproceedings]

Funding

This project is funded by Deutsche Forschungsgemeinschaft (German Research Foundation).
