On Evaluating and Mitigating Gender Biases in Multilingual Settings

Aniket Vashishtha, Kabir Ahuja, Sunayana Sitaram

TL;DR

This work tackles gender bias in multilingual language technologies by building a culturally informed bias benchmark (Multilingual DisCo) for six Indian languages and evaluating cross-language transfer of debiasing. It extends debiasing methods to non-English contexts via Counterfactual Data Augmentation (CDA) and Self-Debiasing, and uses the Multilingual Bias Evaluation (MBE) score to assess bias across high-resource languages. Key findings show that multilingual CDA substantially reduces bias, while Self-Debiasing often increases bias in multilingual settings, underscoring the limits of English-centric debiasing for non-English languages. The paper also provides datasets and code to support scalable, inclusive bias mitigation across languages.
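To make the idea behind Counterfactual Data Augmentation concrete, the following is a minimal English-only sketch, not the authors' implementation. The word list `GENDER_PAIRS` and the helper names `counterfactual` and `augment` are hypothetical and chosen for illustration; a realistic setup would need a much larger, per-language list and would have to handle ambiguous words (such as "her") and grammatical gender agreement, which this sketch ignores.

```python
import re

# Hypothetical, deliberately tiny list of gendered word pairs (assumption,
# not taken from the paper). Ambiguous words such as "her" are left out.
GENDER_PAIRS = [("he", "she"), ("man", "woman"),
                ("father", "mother"), ("son", "daughter")]

# Build a lookup that swaps in both directions.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

# Match any listed word as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in SWAP) + r")\b",
    re.IGNORECASE,
)


def counterfactual(sentence):
    """Return the sentence with every listed gendered word swapped."""
    def swap(match):
        word = match.group(0)
        replacement = SWAP[word.lower()]
        # Preserve capitalisation of the first letter.
        return replacement.capitalize() if word[0].isupper() else replacement
    return _PATTERN.sub(swap, sentence)


def augment(corpus):
    """Keep each sentence and add its counterfactual when it differs."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        swapped = counterfactual(sentence)
        if swapped != sentence:
            augmented.append(swapped)
    return augmented


if __name__ == "__main__":
    for line in augment(["He is a doctor.", "It rains."]):
        print(line)
```

A model fine-tuned on the augmented corpus sees each gendered context in both forms, which is the intuition behind using CDA for debiasing.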

Abstract

While understanding and removing gender biases in language models has been a long-standing problem in Natural Language Processing, prior research has primarily been limited to English. In this work, we investigate some of the challenges with evaluating and mitigating biases in multilingual settings, which stem from a lack of existing benchmarks and resources for bias evaluation beyond English, especially for non-Western contexts. We first create a benchmark for evaluating gender biases in pre-trained masked language models by extending DisCo to different Indian languages using human annotations. We then extend various debiasing methods to work beyond English and evaluate their effectiveness for SOTA massively multilingual models on our proposed metric. Overall, our work highlights the challenges that arise while studying social biases in multilingual settings and provides resources as well as mitigation techniques to take a step toward scaling to more languages.

Paper Structure (14 sections, 2 figures, 4 tables)

Figures (2)

  • Figure 1: Example template translation for "{PERSON} likes to {BLANK}" in Hindi for creation of our multilingual dataset.
  • Figure 2: MBE scores for monolingual and multilingual models and the impact of debiasing across languages.
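
The template quoted in the Figure 1 caption has two slots. As a rough sketch of how such a template could be turned into a query for a masked language model, the snippet below fills `{PERSON}` with a name and replaces `{BLANK}` with a mask token. The function `fill_template`, the example names, and the default `[MASK]` token are assumptions made for illustration (BERT-style tokenizers use `[MASK]`, XLM-R uses `<mask>`); none of them come from the paper's released code.

```python
TEMPLATE = "{PERSON} likes to {BLANK}"  # template from the Figure 1 caption


def fill_template(template, person, mask_token="[MASK]"):
    """Insert a name into the PERSON slot and a mask token into the BLANK slot."""
    return template.replace("{PERSON}", person).replace("{BLANK}", mask_token)


if __name__ == "__main__":
    # Hypothetical names, used only to show the mechanics.
    for name in ["Aarav", "Priya"]:
        print(fill_template(TEMPLATE, name))
```

The resulting strings can be passed to a masked language model, and the words it predicts for the masked slot can then be compared across names associated with different genders.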