About me
News
- My student Marija Stepanovic defends her PhD thesis on Phonetic Vowel Representations for Cross-Lingual Automatic Speech Recognition on 17 December 2024. Public defence at ITU starting at 13:00.
- I’ll give a keynote at the 8th conference on Using Corpora in Contrastive and Translation Studies (UCCTS 2025) in Hildesheim on 8-10 September 2025.
- Recent talks:
  - 30 May 2024, Keynote at symposium Lost in Transl:AI:tion: Implications of machine translation for communication and comprehension.
  - 17 April 2024, Department of Computing Science, Umeå University.
  - 18 December 2023, CMStatistics 2023: Invited talk in a session on Statistical Challenges in Model-Based Data Science.
- Recent papers:
  - Ahmed Ruby, Christian Hardmeier and Sara Stymne. Investigating the Role of Prosody in Disambiguating Implicit Discourse Relations in Egyptian Arabic. Speech Prosody 2024.
  - Paul Engelmann, Peter Brunsgaard Trolle and Christian Hardmeier. A Dataset for the Detection of Dehumanizing Language. LT-EDI 2024.
  - Dennis Ulmer, Christian Hardmeier and Jes Frellsen. Prior and Posterior Networks: A Survey on Evidential Deep Learning Methods for Uncertainty Estimation. Transactions on Machine Learning Research 2023.
- I’ve joined the Danish Pioneer Centre for Artificial Intelligence as co-lead of the Speech and Language Collaboratory together with Isabelle Augenstein.
- Our paper on Experimental Standards in Deep Learning Research: A Natural Language Processing Perspective, written together with Dennis Ulmer, Elisa Bassignana, Max Müller-Eberstein, Daniel Varab, Mike Zhang and Barbara Plank, received an Outstanding Paper Award at the ICLR ML Evaluation Standards Workshop.
- I was on the organisation committee of the Fifth Workshop on Computational Approaches to Discourse (CODI) at EACL 2024 in Malta, 21/22 March 2024.
- I was one of the programme chairs of NODALIDA 2023, a senior area chair for Discourse and Pragmatics at ACL 2023, EMNLP 2023 and ACL 2024, and an area chair at COLING 2025. I will also be an area chair at IJCAI 2025.
- I served as faculty examiner (opponent) at Hannah Devinney’s PhD defence at Umeå University, 18 April 2024.
- On 27 September 2024, my student Dennis Ulmer defended his PhD thesis on uncertainty in natural language processing.
- On 13 June 2023, we organised an afternoon of Lectures on Language Technology and Society at ITU. All the recordings from the workshop are now available on YouTube.
- On 19 August 2022, we organised an online Workshop on Pronouns and Machine Translation with a series of exciting talks. All the recordings from the workshop are available on YouTube.
Bio
I’m an Associate Professor of Computer Science at the IT University of Copenhagen, where I’m a part of the NLPnorth natural language processing group. Until 2022, I was also a part of the Computational Linguistics Group at Uppsala University as a Researcher and Associate Professor (Docent) in Computational Linguistics. In 2019-2021, I spent two years as a visitor and Senior Researcher at the School of Informatics of the University of Edinburgh. From 2009 to 2011, I was a member of the machine translation group at Fondazione Bruno Kessler in Trento.
I hold a PhD in Computational Linguistics from Uppsala University. My PhD thesis was on Discourse in Statistical Machine Translation and received the Best Thesis Award of the European Association for Machine Translation in 2015. My supervisors were Joakim Nivre, Jörg Tiedemann and Marcello Federico. I also have an MA in Nordic Philology from the University of Basel.
Research
I work in computational linguistics, and my research touches on statistical natural language processing, machine learning, machine translation, translation studies and text linguistics. My goal is to create NLP systems with a higher awareness of linguistic context and non-linguistic aspects of communicative situations. I am also interested in studying high-level problems in translation using methods from statistical NLP and machine translation.
I’ve served on the organisation committee of the Workshop on Computational Approaches to Discourse (CODI) at EMNLP 2020, EMNLP 2021, COLING 2022, ACL 2023 and EACL 2024. I also co-organised the Workshops on Gender Bias in Natural Language Processing (GeBNLP) at ACL 2019, COLING 2020, ACL-IJCNLP 2021 and NAACL 2022 and the Workshop on Discourse in MT (DiscoMT), last held at EMNLP-IJCNLP 2019.
Topics I am currently interested in include the following:
Uncertainty quantification and communication
How can we measure how certain or confident large language models are of the output they produce (1, 2)? How can we ensure that their measured confidence is realistic and not overblown, and that the text they produce correctly reflects this confidence? How can large language models effectively convey their level of confidence to diverse user groups?
Toxicity and bias
I am keen on modelling in an explainable way what makes specific forms of toxic language toxic, for instance in the context of dehumanising language (3) and threats. I’m also a part of the SafeNet project, in which we study the response of social media platforms to reports of unsafe language across 19 European countries. In addition, I have studied gender bias in NLP, particularly in the interpretation and generation of referring expressions (4, 5).
Reference
I’m particularly curious about how referring expressions such as pronouns (like she, it, they or this) and lexical noun phrases (like a cat, the house, scrambled eggs or my research) are used across languages, how human translators treat them, what machine translation systems should do with them and how we can use multilingual data to help us interpret them automatically.
These are problems I’ve studied for many years and have approached from many angles:
- Discourse-level MT (6) and its evaluation (7, 8, 9)
- Cross-lingual coreference resolution (10), cross-lingual pronoun prediction (11, 12) and neural language modelling (13)
- Automatic discovery of discourse-related language contrasts in human-translated parallel corpora (14, 15)
- Cross-lingual studies of the generation and interpretation of referring expressions with human subjects (16, 17, 18)
- Coreference annotation of multilingual corpora (19, 20)