
Generating Gender Alternatives in Machine Translation

Sarthak Garg, Mozhdeh Gheini, Clara Emmanuel, Tatiana Likhomanenko, Qin Gao, Matthias Paulik

TL;DR

This work tackles generating all grammatically valid gendered translations in MT by introducing entity-level gender alternatives and aligning gender-sensitive phrases to ambiguous entities. It develops a semi-supervised framework that encodes gender structures and their source alignments in a single translation, enabling generation of all valid gender combinations without extra inference cost. The authors release G-Tag and G-Trans datasets across multiple language pairs, and demonstrate that data augmentation via fine-tuned MT models or LLMs significantly improves the quality and coverage of gender alternatives while reducing bias. The approach enables user-centric translation UIs and aids human translators, with potential extensions to non-binary and broader gender-inflected languages.

Abstract

Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term "the nurse") into the gendered form that is most prevalent in the systems' training data (e.g., "enfermera", the Spanish term for a female nurse). This often reflects and perpetuates harmful stereotypes present in society. With MT user interfaces in mind that allow for resolving gender ambiguity in a frictionless manner, we study the problem of generating all grammatically correct gendered translation alternatives. We open source train and test datasets for five language pairs and establish benchmarks for this task. Our key technical contribution is a novel semi-supervised solution for generating alternatives that integrates seamlessly with standard MT models and maintains high performance without requiring additional components or increasing inference overhead.
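
As a rough illustration of the idea summarized in the TL;DR (a single translation that carries gender structures tied to ambiguous entities, from which every valid gender combination can be read off), the sketch below expands such an output into its alternatives. The segment representation, the entity ids, and the function name `expand_alternatives` are assumptions made for this example and are not the paper's actual output format; the sentence is also illustrative.

```python
from itertools import product

# Hypothetical representation (not the paper's format): a translation is a
# list of segments. A plain string is gender-neutral text; a tuple
# (entity_id, masculine, feminine) is a gender structure tied to one
# ambiguous source entity.
def expand_alternatives(segments):
    """Return a dict mapping each gender assignment to its full translation."""
    entities = sorted({seg[0] for seg in segments if isinstance(seg, tuple)})
    alternatives = {}
    # One choice (M or F) per entity; structures sharing an entity change together.
    for genders in product("MF", repeat=len(entities)):
        assignment = dict(zip(entities, genders))
        words = []
        for seg in segments:
            if isinstance(seg, tuple):
                entity, masc, fem = seg
                words.append(masc if assignment[entity] == "M" else fem)
            else:
                words.append(seg)
        alternatives[genders] = " ".join(words)
    return alternatives

# Illustrative sentence: both structures refer to the same entity (id 0).
segments = [(0, "El doctor", "La doctora"), "está", (0, "cansado", "cansada")]
for genders, text in expand_alternatives(segments).items():
    print(genders, text)
```

Because the choice is made per entity rather than per structure, a sentence with one ambiguous entity yields two alternatives regardless of how many gender-sensitive phrases refer to it, and two independent entities yield four.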

Paper Structure

This paper contains 31 sections, 12 equations, 5 figures, 10 tables, 2 algorithms.

Figures (5)

  • Figure 1: Prompting LLMs using in-context examples to edit the reference translation $y_B$ into all-masculine and all-feminine gender assignments. Multiple in-context examples are used but we illustrate only one here for brevity.
  • Figure 2: Number of examples vs. number of ambiguous entities in the test set.
  • Figure 3: Ablation on the number of in-context examples. We use GPT's alternative recall on English–Spanish as an exemplar. Per these results, we use six in-context examples for prompting.
  • Figure 4: Prompting LLMs using in-context examples to generate translations with all-masculine and all-feminine gender assignments from scratch.
  • Figure 5: This figure shows an example of aligning the gender structure $(\text{\color{brown}El doctor} \,/\, \text{\color{teal}La doctora})$. The model is fine-tuned to classify the source tokens as being aligned ($1$) or not-aligned ($0$) to this gender structure.