Predicting and Manipulating Exercise Difficulty for Language Learning


In an increasingly globalized labor market, knowledge of one or more foreign languages is more relevant than ever before. Due to increased mobility, multilingual skills are also required for private communication as friendships stretch across geographical and linguistic borders.

At the same time, learners find that basic foreign language skills deteriorate quickly if they are not practiced and improved on a regular basis. However, the fixed schedule of conventional language courses is often incompatible with learners’ unstable working conditions and lifestyles. Therefore, many learners turn to online portals for self-directed learning. These portals are becoming increasingly popular although the content they provide is rather inflexible and limited. So far, adaptive technologies that individually adjust content to the learners’ proficiency level, their speed of progress and their learning style are at an early stage of development. In order to generate adaptive exercises with varying difficulty, we need to be able to measure difficulty automatically. In this project, we have developed measures for predicting and manipulating the difficulty of texts, words and exercises for language learners.

Text Difficulty

Text difficulty is commonly studied under the concept of readability, a measure of how complex a text is. Higher readability scores indicate that a text can be comprehended more easily. A manual is expected to have a higher level of readability than a philosophical thesis, for example. Traditional approaches for measuring readability only take the average sentence length and the average word length into account. This is insufficient because it ignores other factors such as lexical-semantic difficulty (e.g. choice of words), syntactic difficulty (e.g. grammatical constructions), and discourse difficulty (e.g. cohesion and coherence). In addition, most readability approaches focus on native speakers of English.
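The two surface features used by traditional approaches can be sketched in a few lines of Python. This is only an illustration: the function name `surface_features`, the naive sentence splitter and the letters-only word pattern are assumptions made for this example and are not taken from dkpro-tc-readability.

```python
import re


def surface_features(text: str) -> dict:
    """Compute average sentence length (in words) and average word length (in characters)."""
    # Naive sentence split after ., ! or ? (an assumption; real systems use a
    # proper sentence segmenter).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are approximated as runs of letters; digits and punctuation are ignored.
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        return {"avg_sentence_length": 0.0, "avg_word_length": 0.0}
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }


features = surface_features("The cat sat. It was happy!")
print(features["avg_sentence_length"])  # 6 words in 2 sentences
```

Both values depend only on surface form, so two texts with the same sentence and word lengths receive the same score even if one uses far rarer vocabulary or more complex syntax.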

We discuss the range of readability features and their applicability to language learners (L2 readability) and to other languages in:

  • Lisa Beinborn and Torsten Zesch and Iryna Gurevych: ‘Towards fine-grained readability measures for self-directed language learning’, in: Proceedings of the Swedish Language Technology Conference: Workshop on NLP for CALL, Vol. 80 (2), pp. 11–19, Lund, Sweden, October 2012.
  • Lisa Beinborn and Torsten Zesch and Iryna Gurevych: ‘Readability for foreign language learning: The importance of cognates’, in: International Journal of Applied Linguistics, Vol. 165 (2), pp. 136–162, 2014.

Most of the discussed features have been implemented and are available in dkpro-tc-readability.

Word Difficulty

For language learners, the difficulty of single words can lead to severe comprehension problems. The process of learning the basic syntactic structures of an L2 can be considered to be more or less completed at a certain point, but vocabulary acquisition is a continuous process that remains important even for advanced learners. However, assessing whether a learner knows a word is not trivial because word knowledge consists of many factors such as knowledge of the spoken form, the written form, grammatical behavior, collocation behavior, and many others. In this project, we focused on two aspects that are particularly relevant for language learners: cognateness and spelling difficulty.

  1. Cognateness
    Cognates are words that are similar in different languages, e.g. “elegance” in English and “Eleganz” in German. These words can be particularly helpful for language learners when attempting to comprehend an unknown text. Even if a learner has never seen a foreign word before, she might be able to guess the meaning due to the similarity to words in another language. A list of cognate pairs would thus constitute an important resource for automated exercise generation.
    Cognates often follow regular production patterns, e.g. the pairs “ignorance-Ignoranz”, “tolerance-Toleranz” and “redundance-Redundanz” are similar to the “elegance-Eleganz” example. Such regularities enable the application and modification of methods from the field of statistical machine translation for cognate production.

    [Figure: workflow for cognate production]

    This approach is described in:

    • Lisa Beinborn and Torsten Zesch and Iryna Gurevych: ‘Cognate Production using Character-based Machine Translation’, in: Proceedings of the Sixth International Joint Conference on Natural Language Processing (IJCNLP), pp. 883–891, Nagoya, Japan, October 2013.

    Data and models can be found here.

  2. Spelling Difficulty
    Cognates facilitate comprehension, but they often lead to spelling errors. We developed an approach to predict the spelling difficulty of words based on word familiarity features and phonetic features. We evaluated the approach on spelling errors extracted from learner corpora and found that the L1 of the learners has a strong influence on spelling difficulty.
    The approach is described in:

    • Lisa Beinborn and Torsten Zesch and Iryna Gurevych: ‘Predicting the Spelling Difficulty of Words for Language Learners’, in: Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications held in conjunction with NAACL 2016: to appear, San Diego, California, USA, June 2016.

    The data is available here.
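The regular cognate patterns mentioned under Cognateness (for example “elegance-Eleganz”) can be illustrated with a toy sketch. It is not the character-based machine translation approach used in the project: it merely strips the longest common prefix from each training pair and reuses the most frequent ending substitution. All function names are hypothetical, and the training pairs are the examples given in the text.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def suffix_rule(source: str, target: str) -> Tuple[str, str]:
    """Strip the longest common prefix (case-insensitive) and return the differing endings."""
    s, t = source.lower(), target.lower()
    i = 0
    while i < min(len(s), len(t)) and s[i] == t[i]:
        i += 1
    return s[i:], t[i:]


def learn_rules(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often each ending substitution occurs in the training pairs."""
    return Counter(suffix_rule(src, tgt) for src, tgt in pairs)


def produce(word: str, rules: Counter) -> Optional[str]:
    """Apply the most frequent matching rule, or return None if no rule matches."""
    lowered = word.lower()
    for (src_end, tgt_end), _count in rules.most_common():
        if src_end and lowered.endswith(src_end):
            # German nouns are capitalised, so the output is capitalised as well.
            return (lowered[: -len(src_end)] + tgt_end).capitalize()
    return None


training = [("elegance", "Eleganz"), ("ignorance", "Ignoranz"), ("tolerance", "Toleranz")]
rules = learn_rules(training)
print(produce("redundance", rules))
```

A rule table like this overgeneralizes quickly (any English word ending in “ce” would be rewritten), which is one reason a statistical model over character sequences is a better fit for the task.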

Exercise Difficulty

Text and word difficulty have a strong influence on the difficulty of text-based exercises. In addition to these content factors, the format of the exercise also plays a role. In this project, we developed a model to predict the difficulty of text-completion exercises. A text-completion exercise is a text in which some words have been completely or partially replaced by a gap. In order to solve a text-completion exercise, the learner needs to fill in the gaps. Exercises differ with respect to the gap format, which influences the candidate ambiguity, and the deletion rate, which influences the dependencies between items.
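How gap format and deletion rate shape an exercise can be sketched as follows. This is a minimal illustration, not the project's exercise generator: the function `make_gaps`, its parameters, and the choice to keep the first or last `len(word) // 2` letters are assumptions. The formats "full", "suffix" and "prefix" roughly correspond to cloze-style, C-test-style and X-test-style gaps as they are commonly described.

```python
from typing import List, Tuple


def make_gaps(text: str, every: int = 2, gap_format: str = "suffix") -> Tuple[str, List[str]]:
    """Gap every n-th word of the text and return (gapped text, solutions).

    every:      gap every n-th token; smaller values mean a higher deletion rate.
    gap_format: "full" blanks the whole word, "suffix" blanks the end of the
                word, "prefix" blanks the beginning of the word.
    """
    if every < 1:
        raise ValueError("every must be at least 1")
    if gap_format not in {"full", "suffix", "prefix"}:
        raise ValueError("unknown gap format: " + gap_format)

    out, solutions = [], []
    for i, token in enumerate(text.split(), start=1):
        word = token.rstrip(".,;:!?")
        punct = token[len(word):]
        # Leave tokens untouched if they are not selected, too short, or not purely alphabetic.
        if i % every != 0 or len(word) < 2 or not word.isalpha():
            out.append(token)
            continue
        keep = len(word) // 2          # number of letters that stay visible
        blank = "_" * (len(word) - keep)
        if gap_format == "full":
            gapped = "_" * len(word)
        elif gap_format == "suffix":
            gapped = word[:keep] + blank
        else:  # "prefix"
            gapped = blank + word[len(word) - keep:]
        out.append(gapped + punct)
        solutions.append(word)
    return " ".join(out), solutions


gapped, answers = make_gaps("The weather was very nice today.", every=2, gap_format="suffix")
print(gapped)
print(answers)
```

With partial gaps the visible letters restrict the set of plausible candidates, whereas a full gap leaves every word that fits the context as a candidate. Lowering `every` puts gaps closer together, so solving one gap depends more often on having solved its neighbors.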

  [Figure: difficulty model]

This difficulty model has been applied to C-tests, X-tests and cloze tests in English, French and German. The approach and the results can be found in:

  • Lisa Beinborn and Torsten Zesch and Iryna Gurevych: ‘Predicting the Difficulty of Language Proficiency Tests’, in: Transactions of the Association for Computational Linguistics (TACL), Vol. 2 (1), pp. 517–529, November 2014.
  • Lisa Beinborn and Torsten Zesch and Iryna Gurevych: ‘Candidate Evaluation Strategies for Improved Difficulty Prediction of Language Tests’, in: Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications held in conjunction with NAACL 2015, pp. 1–11, Denver, Colorado, USA, June 2015.

The data and models can be found here.

Manipulating Difficulty

Based on the difficulty prediction approaches, we can manipulate the difficulty of exercises. In this project, we evaluated two manipulation directions: content selection and distractor substitution.

  1. Content selection
    We provide a web demo for our difficulty prediction approach that allows test designers to instantaneously approximate the difficulty of a C-test for a chosen text. In a second step, we applied this approach to a text corpus to select appropriate texts for exercises. An evaluation with human experts shows that automatic content selection can be a useful tool, but topic preferences should be taken into account. The generated C-tests and the evaluation results can be found here.
  2. Distractor manipulation
    The difficulty of cloze exercises is mainly determined by the choice of the distractors. We analyze whether our evaluation strategies for candidate ambiguity can be used to find substitutions for distractors that increase or decrease the difficulty of the exercise. In addition, we provide a substitution dataset that contains noun synonyms extracted from the lexical resource Uby that have been enriched with cognateness and spelling difficulty information. The dataset can be found here.


Project Publications


Predicting the Difficulty of Language Proficiency Tests

Lisa Beinborn, Torsten Zesch, Iryna Gurevych
In: Transactions of the Association for Computational Linguistics, Vol. 2, pp. 517–529, November 2014

Cognate Production using Character-based Machine Translation

Lisa Beinborn, Torsten Zesch, Iryna Gurevych
In: Proceedings of the Sixth International Joint Conference on Natural Language Processing, pp. 883–891, October 2013
Asian Federation of Natural Language Processing