
Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures

Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frédéric Blain, Eva Vanmassenhove

Abstract

While Large Language Models achieve state-of-the-art results across a wide range of NLP tasks, they remain prone to systematic biases. Among these, gender bias is particularly salient in machine translation (MT), because languages differ systematically in whether and how they mark gender. As a result, translation often requires disambiguating implicit source signals into explicit gender-marked forms. In this context, standard benchmarks may capture broad disparities but fail to reflect the full complexity of gender bias in modern MT. In this paper, we extend recent frameworks on bias evaluation by: (i) introducing a novel measure, "Prior Bias", which captures a model's default gender assumptions, and (ii) applying the framework to decoder-only MT models. Our results show that, despite their scale and state-of-the-art status, decoder-only models do not generally outperform encoder-decoder architectures on gender-specific metrics; however, post-training (e.g., instruction tuning) not only improves contextual awareness but also reduces the masculine Prior Bias.

Paper Structure (20 sections, 6 figures, 5 tables)

Figures (6)

  • Figure 1: EN$\rightarrow$IT WinoMT examples showing outputs defaulting to la governante (fem.) and il contadino (masc.), despite the gender cue (he/she).
  • Figure 2: Prompt templates for the EN$\rightarrow$IT task: a plain template for Llama2 and TowerBase, and a chat-style template for TowerInstruct.
  • Figure 3: Example illustrating the neutralization process followed to construct the Neutral Set.
  • Figure 4: Minimal pair illustration. The pro-stereotypical case (top) assigns she to librarian, while the anti-stereotypical case (bottom) assigns he. Accurate EN$\rightarrow$IT translations must adapt grammatical gender: la bibliotecaria (f.) vs. il bibliotecario (m.).
  • Figure 5: Attention from the translated profession noun to the feminine cue across layers (y-axis) and heads (x-axis). Values are the attention weight assigned to the cue, averaged over $n=195$ minimal pairs, the smallest number of accurately translated minimal pairs observed for any model. We focus on intermediate layers (8--20), where attention is strongest; earlier and later layers exhibit near-zero values. A single color scale is used; darker shades indicate higher average attention.
  • ...and 1 more figures