This is a list of projects that can serve as a basis for student theses and similar coursework. If you are a student at the IT University of Copenhagen and would like to work on one of these topics as part of your degree, please feel free to get in touch with me.
Cross-lingual alignment of referring expressions
Texts contain referring expressions that point to things in the real world. Given parallel text in two languages, find ways to match referring expressions that have the same referent and serve the same function in the text. This project is ideal for a student with a background or interest in linguistics in addition to good knowledge of NLP and machine learning.
Visualisation of complex linguistic structures across languages
When you work with texts in multiple languages that are marked up for complex linguistic structures, it quickly becomes very difficult to see what’s going on across languages. This project aims to develop a user interface for working efficiently with this type of multidimensional data. It would be a good match for a student with an interest and experience in user interface development.
The following projects are collaborations with an industry partner who will provide data and additional supervision (note: no paid work opportunities). They are standard NLP tasks that should be suitable for any student with a good foundation in data science or computer science.
- Email classification in a technical domain.
- Matching article numbers for identical spare parts across multiple equipment catalogues by clustering textual descriptions.
- Creating a chatbot to help non-technical stakeholders interact with software solutions in use at the company.
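
To give a rough idea of what the spare-parts matching task might involve, here is a minimal sketch that groups catalogue descriptions by character trigram overlap. Everything in it is an assumption made for illustration: the function names, the Jaccard similarity measure, the threshold of 0.5 and the example descriptions are hypothetical and do not come from the industry partner. An actual project would work with the partner's data and would likely explore stronger text representations and clustering methods.

```python
from itertools import combinations


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased,
    whitespace-normalised description."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(max(len(s) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(descriptions, threshold=0.5):
    """Single-link clustering: two descriptions end up in the same
    cluster if they are connected by a chain of pairs whose trigram
    similarity is at least `threshold` (a hypothetical value).
    Returns a sorted list of clusters, each a list of indices."""
    grams = [char_ngrams(d) for d in descriptions]
    parent = list(range(len(descriptions)))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(descriptions)), 2):
        if jaccard(grams[i], grams[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(descriptions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up descriptions, as they might appear in two different catalogues.
parts = [
    "Hydraulic pump seal 40mm",
    "hydraulic pump seal 40 mm",
    "Air filter cartridge",
]
clusters = cluster(parts)
```

With these three made-up descriptions, the two pump seal entries fall into one cluster and the air filter stays on its own. Comparing all pairs is quadratic in the number of descriptions, so this only illustrates the idea and would not scale to full catalogues without further work.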