
Machine translation (MT) systems have become an indispensable part of daily life: they translate texts, subtitles, and even spoken language in real time. While they help us overcome language barriers, critical questions arise: Do these systems perpetuate harmful stereotypes? Do they reinforce biases instead of reducing them? And, consequently, how can we even measure this? In this interdisciplinary research project, we tackle exactly these questions. Our goal is to develop an improved Gender Bias Evaluation Testset (GBET) that systematically evaluates gender bias in machine translation. We build on an existing method and extend it with new data and approaches. We examine various translation systems, from Google Translate and DeepL to large language models such as ChatGPT, and evaluate the extent to which each exhibits gender bias.
- Trainer: Tobias Jettkowski
- Trainer: Michelle Kappl