Do Multilingual Large Language Models Mitigate Stereotype Bias?

Shangrui Nie, Michael Fromm, Charles Welch, Rebekka Görge, Akbar Karimi, Joan Plepi, Nazia Afsan Mowmita, Nicolas Flores-Herr, Mehdi Ali, Lucie Flek

TL;DR

The paper investigates whether multilingual pre-training reduces stereotype bias in decoder-based LLMs. It trains six 2.6B-parameter models (five monolingual and one multilingual, the latter on an equal share of data from each language) and evaluates bias on translated CrowS-Pairs and BBQ benchmarks whose translations were validated by human annotators. It demonstrates that multilingual training yields lower bias and often higher accuracy than monolingual training with the same amount of data and the same architecture. The study combines translation-quality control with cross-language evaluation to make the bias measurements robust, and compares against open-source baselines to contextualize performance. These findings support multilingual pre-training as an effective bias mitigation strategy and highlight practical implications for deploying fairer LLMs across multiple languages.

Abstract

While preliminary findings indicate that multilingual LLMs exhibit reduced bias compared to monolingual ones, a comprehensive understanding of the effect of multilingual training on bias mitigation is lacking. This study addresses this gap by systematically training six LLMs of identical size (2.6B parameters) and architecture: five monolingual models (English, German, French, Italian, and Spanish) and one multilingual model trained on an equal distribution of data across these languages, all using publicly available data. To ensure robust evaluation, standard bias benchmarks were automatically translated into the five target languages and verified for both translation quality and bias preservation by human annotators. Our results consistently demonstrate that multilingual training effectively mitigates bias. Moreover, we observe that multilingual models achieve not only lower bias but also superior prediction accuracy when compared to monolingual models with the same amount of training data, model architecture, and size.

Paper Structure (22 sections, 4 equations, 8 figures, 6 tables)

This paper contains 22 sections, 4 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: An example from the BBQ dataset (Parrish et al., 2022), where a multilingual model shows unbiased behavior compared to a monolingual model.
  • Figure 2: Heat map of CrowS-Pairs bias percentage scores using our models and open-source models. A perfect score would be 0, which represents an equal probability of choosing either sentence. The microaverage is computed across all categories based on frequency. Our multilingual model has less bias than monolingual models and open-source LLMs (the likelihood assigned to the non-stereotyping sentence is higher).
  • Figure 3: Heat map of BBQ overall accuracy using our monolingual and multilingual models (left) as well as the open-source models (right). Our multilingual model is better than monolingual models in all languages and surpasses most of the open-source LLMs.
  • Figure 4: Heat map of BBQ accuracies for our monolingual and multilingual models. The left side shows accuracy for the ambiguous contexts, while the right shows accuracy for the disambiguated contexts. Our multilingual model has much higher accuracy in ambiguous contexts, but slightly lower accuracy in disambiguated contexts.
  • Figure 5: Heat map of BBQ biases using our monolingual and multilingual models. The left side shows bias scores for the ambiguous contexts, while the right shows bias scores for the disambiguated contexts.
  • ...and 3 more figures
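
The Figure 2 caption describes a bias score whose ideal value is 0 and a microaverage weighted by category frequency. The Python sketch below illustrates one way such numbers could be computed. It is not the paper's implementation: the centering (percentage of pairs where the stereotyping sentence is preferred, minus 50) is an assumption chosen so that an equal preference gives 0, and the function names and toy data are hypothetical.

```python
from typing import Dict, List


def bias_score(prefers_stereo: List[bool]) -> float:
    """Centered bias percentage (assumed definition, not taken from the paper).

    Each entry is True when the model assigns a higher likelihood to the
    stereotyping sentence of a pair. The result is 0 when both sentences
    are preferred equally often, positive when the stereotyping sentence
    is preferred more often, and negative otherwise.
    """
    if not prefers_stereo:
        raise ValueError("need at least one sentence pair")
    return 100.0 * sum(prefers_stereo) / len(prefers_stereo) - 50.0


def micro_average(per_category: Dict[str, List[bool]]) -> float:
    """Pool all pairs across categories, so larger categories weigh more."""
    pooled = [p for prefs in per_category.values() for p in prefs]
    return bias_score(pooled)


def weighted_average(per_category: Dict[str, List[bool]]) -> float:
    """Frequency-weighted mean of the per-category scores."""
    total = sum(len(prefs) for prefs in per_category.values())
    return sum(
        len(prefs) * bias_score(prefs) for prefs in per_category.values()
    ) / total


# Toy data (hypothetical): two categories of different sizes.
toy = {
    "gender": [True, True, False, False],  # balanced preferences
    "age": [True, True],                   # always prefers the stereotype
}
print(bias_score(toy["gender"]), bias_score(toy["age"]))
print(micro_average(toy), weighted_average(toy))
```

Under this assumed definition, pooling all pairs and taking the frequency-weighted mean of per-category scores give the same value, which is why a microaverage is dominated by the larger categories.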