Fine-grained Gender Control in Machine Translation with Large Language Models

Minwoo Lee, Hyukhun Koh, Minsung Kim, Kyomin Jung

TL;DR

This work tackles fine-grained gender control in machine translation by introducing Gender-of-Entity (GoE) prompting for large language models, enabling per-entity gender inflection control across multiple entities and languages. It formalizes the task, evaluates on four benchmarks (MuST-SHE, GATE, WinoMT, MT-GenEval Contextual), and demonstrates state-of-the-art gender accuracy for single-entity cases while revealing gender interference in multi-entity settings. The study also proposes I-GoE to improve end-to-end gender inference and introduces LLMs as Gender Evaluators (LGE) to provide a reference-free, model-agnostic assessment of gender inflection, achieving strong agreement with human judgments. Overall, GoE advances fine-grained controlled translation with notable practical implications for bias-aware MT, while LGE offers a robust evaluation framework that mitigates annotation-dependent limitations.

Abstract

In machine translation, ambiguously gendered input is a known problem: the gender of an entity is not available in the source sentence. To address this ambiguity, the task of controlled translation, which takes the gender of the ambiguous entity as additional input, has been proposed. However, most existing works have considered only a simplified setup with a single target gender per input. In this paper, we tackle controlled translation in a more realistic setting of inputs with multiple entities and propose the Gender-of-Entity (GoE) prompting method for LLMs. Our proposed method instructs the model with fine-grained entity-level gender information to translate with correct gender inflections. Using four evaluation benchmarks, we investigate the controlled translation capability of LLMs along multiple dimensions and find that LLMs reach state-of-the-art performance in controlled translation. Furthermore, we observe the emergence of a gender interference phenomenon when controlling the gender of multiple entities. Finally, we address the limitations of existing gender accuracy evaluation metrics and propose leveraging LLMs as evaluators of gender inflection in machine translation.
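To make the idea of entity-level gender instructions concrete, the following is a minimal sketch of how a prompt with one gender instruction per entity might be assembled. The template wording, the `EntityGender` class, and the `build_goe_prompt` function are hypothetical illustrations and are not taken from the paper, whose actual prompt format is not given in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityGender:
    """One ambiguous source entity and the gender its translation should take."""
    entity: str  # surface form as it appears in the source sentence
    gender: str  # e.g. "female" or "male"


def build_goe_prompt(source: str, target_lang: str, entities: list[EntityGender]) -> str:
    """Assemble an illustrative prompt with one gender instruction per entity.

    The wording is an assumed template, not the paper's actual prompt.
    """
    # Reject entities that do not occur in the source sentence, since an
    # instruction about a missing entity could not be applied.
    for item in entities:
        if item.entity.lower() not in source.lower():
            raise ValueError(f"entity {item.entity!r} not found in source sentence")

    lines = [f"Translate the following sentence into {target_lang}."]
    lines.append("Use these genders for the listed entities:")
    lines += [f"- {item.entity}: {item.gender}" for item in entities]
    lines.append(f"Sentence: {source}")
    lines.append("Translation:")
    return "\n".join(lines)


prompt = build_goe_prompt(
    "The doctor asked the nurse to help.",
    "Spanish",
    [EntityGender("doctor", "female"), EntityGender("nurse", "male")],
)
print(prompt)
```

The example assigns different genders to the two entities, which corresponds to the multi-entity setting in which the paper reports gender interference.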

Paper Structure

This paper contains 35 sections, 1 equation, 3 figures, and 15 tables.

Figures (3)

  • Figure 1: Four gender control scenarios in machine translation investigated in our study.
  • Figure 2: Gender accuracy of GoE prompting on the GATE subset with two ambiguous entities (#Ent=2). Uniform denotes translation with both entities mapped to the same gender, and Mixed denotes translation with entities mapped to different genders.
  • Figure 3: Example of human annotation pages.