All the recordings from the workshop are now available on YouTube. Find them here:

We are delighted to invite you to our public Workshop on Pronouns and Machine Translation, which will be held on-line on 19 August 2022. The workshop features a series of lectures on recent work related to understanding, modelling and evaluating pronouns and other discourse-level phenomena in neural machine translation and a panel discussion. It is organised as a part of our research project on Neural Pronoun Models for Machine Translation, which is funded by the Swedish Research Council (grant 2017-930) and will conclude at the end of this year.


Time (UTC+2) Speaker

Sheila Castilho, ADAPT Centre, Dublin City University
The DELA Project: what do we know about MT evaluation with context?

The challenge of evaluating translations in context has been attracting interest in the machine translation (MT) field. However, the definition of what constitutes a document-level (doc-level) MT evaluation, in terms of how much of the text needs to be shown, is still unclear (Castilho et al., 2020). Few works have taken into account doc-level human evaluation (Barrault et al., 2020), and one common practice is the use of test suites with context-aware markers. But why do we need context in MT evaluation? Document-level evaluation of MT allows for a more thorough examination of the output quality with context, avoiding common cases of misevaluation. The main objective of the DELA Project is to define best practices for doc-level MT evaluation, and to test existing human and automatic sentence-level evaluation metrics at the doc-level. In this talk, we will look into the results of the project so far regarding methodologies for adding context in MT evaluation tasks, a document-level corpus annotated with context-related issues, and how much context span is enough to solve those issues.


Deyi Xiong, Tianjin University
Modeling Cohesion Devices for Context-Aware Neural Machine Translation

Text is written cohesively: sentences are connected via grammatical or lexical cohesion devices. In this talk, I will present our recent efforts to model and incorporate cohesion devices into context-aware neural machine translation. In the first work, I will introduce a document-level neural machine translation framework, CoDoNMT, which models cohesion devices from two perspectives: Cohesion Device Masking (CoDM) and Cohesion Attention Focusing (CoAF). The former forces NMT to predict masked cohesion devices with inter-sentential context information. The latter attempts to guide the model to pay exclusive attention to relevant cohesion devices in the context when translating cohesion devices in the current sentence. In the second work, we conduct a deep analysis of a dialogue corpus and summarize three major issues in dialogue translation: pronoun dropping (ProDrop), punctuation dropping (PunDrop), and typos (DialTypo). In response to these challenges, we propose a joint learning method to identify omissions and typos during translation, and to use context to translate dialogue utterances.


Kayo Yin, DeepMind/UC Berkeley
Understanding, Improving and Evaluating Context Usage in Context-aware Translation

Context-aware Neural Machine Translation (NMT) models have been proposed to perform document-level translation, where certain words require information from the previous sentences to be translated accurately. However, these models are unable to use context adequately and often fail to translate relatively simple discourse phenomena. In this talk, I will discuss methods to measure context usage in NMT by using human annotations and conditional cross mutual information, as well as training methods to improve context usage by supervising attention and performing contextual word dropout. I will also discuss ways to identify words that require context to translate and how to evaluate NMT models on these ambiguous phenomena, and present open challenges in document-level translation.


Prathyusha Jwalapuram, Nanyang Technological University
Benchmarking Context-Aware MT Systems: Testsets and Evaluation Measure for Pronominal Anaphora

The neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Despite increasing instances of machine translation systems including contextual information, the evidence for translation quality improvement is sparse, especially for discourse phenomena. Most of these phenomena go virtually unnoticed by traditional automatic evaluation measures such as BLEU. Such metrics are not expressive or sensitive enough to capture quality improvements or drops that are minor in size but significant in perception. I will present testsets and a model-based evaluation measure for pronominal anaphora, and highlight the need for performing such fine-grained evaluation. Our model-based evaluation measure utilizes positive and negative instances of data to train a model that can rank texts in terms of pronoun translation quality. I will then present benchmarking results for several context-aware machine translation models using these testsets and evaluation measures, showing that the performance is not always consistent across languages.


Panel discussion

14:00-14:30 Break

Christian Hardmeier, Uppsala University/IT University of Copenhagen
Understanding Pronouns in Translation: Resources, Methods, Challenges and Insights

The difficulty of pronoun translation is typically illustrated with examples of anaphoric pronouns requiring gender agreement in the target language. However, pronoun translation is more complex than that. In this talk, I present our efforts to understand the generation and interpretation of pronouns in translation. A core outcome of our project is the ParCorFull corpus, a multilingual parallel data resource with a rich annotation of coreferential phenomena going beyond simple anaphoric references. ParCorFull has found a range of applications, from the cross-lingual study of texts to machine translation evaluation, leading to insights into translation processes, but also uncovering challenges due to how corpus annotation resolves ambiguity, potentially creating conflicts in parallel data. Additional insights can be gained from studies of pronoun generation and interpretation we've conducted with human participants, highlighting the variance of typical patterns across five European languages. By comparing artificially contrived prompts with stimuli derived from ParCorFull, we find indications that pronoun use and interpretation are sensitive to the perceived register of the prompts.


Gongbo Tang, Uppsala University
Cross-lingual coreference resolution models and pronoun translation with mention attention

I will present our work in two parts: cross-lingual coreference resolution models and pronoun translation with mention attention. Part 1: Neural coreference resolution models only utilize monolingual data for training. We propose a simple yet effective model to exploit coreference knowledge from parallel data. Our proposed cross-lingual model achieves consistent improvements, up to 1.74 percentage points, on the OntoNotes 5.0 English dataset using 9 different synthetic parallel datasets. We found that our unsupervised module only learns limited cross-lingual coreference knowledge and that there is no parallel corpus with aligned coreference chains. Thus, we further align the coreference chains in the ParCorFull corpus. Part 2: Neural machine translation (NMT) models have achieved great success. However, pronouns are still challenging to disambiguate during translation. We add a mention attention module and mention prediction losses on top of conventional NMT models to further extract contextual features for disambiguation. Our experimental results show that, in English-to-German translation, the APT scores of English pronouns and English ambiguous pronouns improve from 60.1 to 61.2 and from 50.4 to 52.2, respectively.


Biao Zhang, University of Edinburgh
Cross-Lingual and Cross-Modality Modeling for Context-Aware Machine Translation

Document-level contextual modeling has achieved great success in machine translation in addressing discourse-related ambiguities. Often, these successes are for text-to-text scenarios and based on the availability of large-scale parallel documents. However, many language pairs are short of document-level data, and discourse-related translation issues are not limited to textual translation. What if we have no document data? Would context be helpful to speech-to-text translation? In this talk, I will present our efforts on cross-lingual and cross-modality modeling for context-aware machine translation. I will first show how contextual modeling capability is achievable for languages without document data via cross-lingual transfer. Then, I will show that context also benefits speech-to-text translation in various ways, including but not limited to improving pronoun and homophone translation.