The UKPConvArg1 Corpus is introduced in the following paper:

Habernal, I. & Gurevych, I. (2016). Which argument is more convincing? Analyzing and predicting convincingness of Web arguments using bidirectional LSTM. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Pages: 1589-1599. Berlin, Germany. Association for Computational Linguistics.

Corpus content

  • UKPConvArg1-full-XML

    • This is the full corpus referred to in the article (Table 2, UKPConvArgAll). It contains 32 XML files, each corresponding to one debate/side. The total number of argument pairs is 16,081.

  • UKPConvArg1-Ranking-CSV

    • Exported tab-delimited file with 1,052 arguments, each with its ID, rank score, and text (Table 2, UKPConvArgRank)

  • UKPConvArg1Strict-XML

    • Cleaned version used for the experiments in the article (Table 2, UKPConvArgStrict). It contains 11,650 argument pairs in 32 XML files.

  • UKPConvArg1Strict-CSV

    • The same data as UKPConvArg1Strict-XML, exported as tab-delimited CSV with the pair ID, the label of the more convincing argument (a1 or a2), and the text of both arguments (a1, tab, a2)
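
As a rough illustration of the tab-delimited layout described for UKPConvArg1Strict-CSV, the following Python sketch reads rows of the form ID, label, a1, a2. It is not part of the corpus distribution. The function name read_strict_pairs, the sample row, and the choice to skip lines starting with "#" (in case the files carry a header or comment line) are assumptions made for this example; check them against the actual files before relying on them.

```python
import csv
import io


def read_strict_pairs(handle):
    """Yield (pair_id, label, a1, a2) tuples from a tab-delimited stream.

    Assumes four tab-separated columns in the order described above:
    pair ID, label of the more convincing argument, argument 1, argument 2.
    """
    # QUOTE_NONE: treat quote characters in argument text as ordinary text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumption: blank lines and lines starting with '#' are not data.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        pair_id, label, a1, a2 = row
        if label not in ("a1", "a2"):
            raise ValueError(f"unexpected label {label!r} for pair {pair_id!r}")
        yield pair_id, label, a1, a2


# Hypothetical sample row; real IDs and texts come from the corpus files.
sample = "pair_1\ta1\tFirst argument text.\tSecond argument text.\n"
pairs = list(read_strict_pairs(io.StringIO(sample)))
```

To read an actual file, an open file handle (for example, one opened with encoding="utf-8" and newline="") can be passed in place of the in-memory sample.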


  • The data are licensed under CC-BY (Creative Commons Attribution 4.0 International License)
  • The source arguments originate from

