Whose wife is it anyway? Assessing bias against same-gender relationships in machine translation

Ian Stewart, Rada Mihalcea

TL;DR

This paper investigates bias against same-gender relationships in machine translation by constructing a dataset of relationship templates in noun-gender languages (Spanish, French, Italian) and evaluating three major MT services. The authors show widespread bias: translations of same-gender relationship sentences are less accurate than translations of different-gender ones, with the strongest effects observed for high-income occupations and occupations with high female representation. They use McNemar’s test to establish significance and a logistic regression to link bias to social correlates, such as income, female representation, and age, across languages and models. The findings suggest that bias in MT can distort social relationships and reinforce normative stereotypes, underscoring the need for broader, relationship-focused bias testing and mitigation in NLP systems that affect LGBTQ communities.
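As a rough illustration of the significance test named above, the following is a minimal sketch of an exact (binomial) McNemar test on paired translation outcomes. It assumes each same-gender sentence is paired with a matched different-gender sentence, and that the exact two-sided variant is appropriate; the summary does not say which variant or pairing the authors used. The function name `mcnemar_exact` and the toy outcomes are hypothetical and are not results from the paper.

```python
from math import comb


def mcnemar_exact(pairs):
    """Exact two-sided McNemar test on paired binary outcomes.

    pairs: iterable of (same_ok, diff_ok) booleans, one per matched template,
    where same_ok / diff_ok say whether the same-gender / different-gender
    version was translated correctly. (This pairing scheme is an assumption.)
    """
    pairs = list(pairs)  # allow generators; we iterate twice
    # Discordant pairs: only one of the two versions was translated correctly.
    b = sum(1 for same_ok, diff_ok in pairs if not same_ok and diff_ok)
    c = sum(1 for same_ok, diff_ok in pairs if same_ok and not diff_ok)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference
    # Under the null hypothesis, b ~ Binomial(n, 0.5); double the smaller tail.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy, made-up outcomes purely for illustration (not data from the paper).
toy_pairs = [(False, True)] * 9 + [(True, False)] * 1 + [(True, True)] * 10
b, c, p_value = mcnemar_exact(toy_pairs)
print(f"discordant counts: b={b}, c={c}; p-value={p_value:.4f}")
```

Only the discordant pairs enter the test: pairs where both versions are translated correctly, or both incorrectly, carry no information about a difference between the two relationship types.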

Abstract

Machine translation often suffers from biased data and algorithms that can lead to unacceptable errors in system output. While bias in gender norms has been investigated, less is known about whether MT systems encode bias about social relationships, e.g., "the lawyer kissed her wife." We investigate the degree of bias against same-gender relationships in MT systems, using generated template sentences drawn from several noun-gender languages (e.g., Spanish) and comprised of popular occupation nouns. We find that three popular MT services consistently fail to accurately translate sentences concerning relationships between entities of the same gender. The error rate varies considerably based on the context, and same-gender sentences referencing high female-representation occupations are translated with lower accuracy. We provide this work as a case study in the evaluation of intrinsic bias in NLP systems with respect to social relationships.

Paper Structure (11 sections, 1 equation, 2 figures, 5 tables)

Figures (2)

  • Figure 1: Example translation error of same-gender sentence between English and Spanish (Google Translate; accessed 1 November 2023).
  • Figure 2: Translation accuracy for relationship sentences, grouped by relationship type (same-gender vs. different-gender).