Workshop Recordings

On 13 June 2023, we organised an afternoon of Lectures on Language Technology and Society at ITU. All the recordings from the workshop are now available on YouTube.
On 19 August 2022, we organised an online Workshop on Pronouns and Machine Translation with a series of exciting talks. All the recordings from the workshop are available on YouTube.

Multilingual Annotated Corpus Resources

ParCorFull: A Parallel Corpus Annotated with Full Coreference
ParCorFull is a parallel corpus of texts in English and German manually annotated for coreference. The corpus consists of TED talks and some news articles.

Citation:
Ekaterina Lapshinova-Koltunski, Christian Hardmeier and Pauline Krielke. ParCorFull: a Parallel Corpus Annotated with Full Coreference. Proc. 11th International Conference on Language Resources and Evaluation (LREC), Miyazaki JP, May 2018, pp. 423–428.

ParCor 1.0: A Parallel Pronoun-Coreference Corpus
ParCor is a parallel corpus of texts in English and German annotated for pronoun coreference (partial coreference chains linking pronouns to their closest antecedents). The TED talks in this corpus and their annotations are a subset of those in ParCorFull. Additionally, the corpus contains a number of publications from the EU Bookshop which are not included in ParCorFull.

Citation:
Liane Guillou, Christian Hardmeier, Aaron Smith, Jörg Tiedemann and Bonnie Webber. ParCor 1.0: A Parallel Pronoun-Coreference Corpus to Support Statistical MT. Proc. 9th International Conference on Language Resources and Evaluation (LREC), Reykjavík IS, May 2014, pp. 3191–3198.

DiscoMT 2015 Shared Task on Pronoun Translation

Citation:
Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley and Mauro Cettolo. Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation. Proc. 2nd Workshop on Discourse in Machine Translation (DiscoMT), Lisbon PT, September 2015, pp. 1–16.

WMT 2016 Shared Task on Cross-Lingual Pronoun Prediction

Citation:
Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber and Andrei Popescu-Belis. Findings of the 2016 WMT shared task on cross-lingual pronoun prediction. Proc. 1st Conference on Machine Translation (WMT), Berlin DE, August 2016, pp. 525–542.

DiscoMT 2017 Shared Task on Cross-Lingual Pronoun Prediction

Citation:
Sharid Loáiciga, Sara Stymne, Preslav Nakov, Christian Hardmeier, Jörg Tiedemann, Mauro Cettolo and Yannick Versley. Findings of the 2017 DiscoMT shared task on cross-lingual pronoun prediction. Proc. 3rd Workshop on Discourse in Machine Translation (DiscoMT), Copenhagen DK, September 2017, pp. 1–16.

PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation
This is a test suite for manual or semi-automatic evaluation of pronouns in English-French MT.

Citation:
Liane Guillou and Christian Hardmeier. PROTEST: A Test Suite for Evaluating Pronouns in Machine Translation. Proc. 10th International Conference on Language Resources and Evaluation (LREC), Portorož SI, May 2016, pp. 636–643.

Graphical User Interface for the PROTEST Test Suite
This is a user interface to facilitate manual evaluation of pronouns using the PROTEST test suites. Contact me if you’d like to use it so I can help you set it up.

Citation:
Christian Hardmeier and Liane Guillou. A graphical pronoun evaluation tool for the PROTEST pronoun evaluation test suite. Proc. 19th Annual Conference of the European Association for Machine Translation (EAMT), Riga LV. Baltic Journal of Modern Computing 4 (2), May 2016, pp. 318–330.

AutoPRF Pronoun Evaluation Tool
This is a tool to calculate the precision and recall of pronoun translations in MT output by scoring automatically against a reference translation. See also our paper at EMNLP 2018 for a discussion of this method and a comparison with other approaches.
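
To give a rough idea of what scoring pronouns against a reference can look like, here is a minimal Python sketch. It is not the AutoPRF implementation: it simply compares clipped pronoun counts per sentence pair, whereas the actual tool may work differently (for example, by using word alignments to the source text). The pronoun list and the function names `pronoun_counts` and `pronoun_prf` are assumptions made up for this illustration.

```python
from collections import Counter

# Hypothetical pronoun inventory, for illustration only.
PRONOUNS = frozenset({"il", "elle", "ils", "elles", "ce", "on", "cela", "ça"})


def pronoun_counts(sentence, pronouns=PRONOUNS):
    """Count the pronouns in a whitespace-tokenised, lowercased sentence."""
    tokens = sentence.lower().split()
    return Counter(tok for tok in tokens if tok in pronouns)


def pronoun_prf(candidates, references, pronouns=PRONOUNS):
    """Return (precision, recall, F1) over parallel lists of sentences.

    A candidate pronoun counts as correct if the same word form occurs in
    the reference sentence; matches are clipped so that each reference
    pronoun can be matched at most once.
    """
    matched = cand_total = ref_total = 0
    for cand, ref in zip(candidates, references):
        c = pronoun_counts(cand, pronouns)
        r = pronoun_counts(ref, pronouns)
        matched += sum((c & r).values())  # Counter intersection = clipped match
        cand_total += sum(c.values())
        ref_total += sum(r.values())
    precision = matched / cand_total if cand_total else 0.0
    recall = matched / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    mt_output = ["il est grand", "elles sont là"]
    reference = ["elle est grande", "elles sont là et ils dorment"]
    print(pronoun_prf(mt_output, reference))
```

Matching on surface forms alone penalises legitimate translation alternatives, which is one of the limitations of reference-based pronoun scoring.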

Citation:
Christian Hardmeier and Marcello Federico. Modelling Pronominal Anaphora in Statistical Machine Translation. Proc. 7th International Workshop on Spoken Language Translation (IWSLT), Paris FR, December 2010, pp. 283–289.