Identifying Gender Stereotypes and Biases in Automated Translation from English to Italian using Similarity Networks

Fatemeh Mohammadi; Marta Annamaria Tamborini; Paolo Ceravolo; Costanza Nardocci; Samira Maghool

TL;DR

The paper tackles gender bias in English-to-Italian machine translation. It grounds its definition of bias in legal and linguistic notions and addresses the difference between notional and grammatical gender that arises in Italian. It introduces a similarity-network approach that uses multilingual FastText embeddings and compares against Google Translate to quantify both the intensity and the direction of bias for a broad word set. Two numerical features, GenderBiasIntensity and GenderBiasDirection, together with PostTranslationSimilarityChanges, enable fine-grained assessment of bias before and after translation. The analysis identifies embeddings as the primary source of bias and shows directional shifts after translation. The findings offer actionable guidance for developing gender-neutral MT: they highlight the need to debias training corpora and embeddings, and they point to future work on alternative embeddings, multiple MT models, and other bias categories.

Abstract

This paper is a collaborative effort between Linguistics, Law, and Computer Science to evaluate stereotypes and biases in automated translation systems. We advocate gender-neutral translation as a means to promote gender inclusion and improve the objectivity of machine translation. Our approach focuses on identifying gender bias in English-to-Italian translations. First, we define gender bias following human rights law and linguistics literature. We then identify gender-specific terms, such as she/lei and he/lui, as key elements. Next, we evaluate the cosine similarity between these target terms and the other terms in the dataset to reveal the model's perception of semantic relations. Using numerical features, we quantify the intensity and direction of the bias. Our findings provide tangible insights for developing and training gender-neutral translation algorithms.
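
The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the two-dimensional vectors are invented toy values rather than FastText embeddings, and the `gender_bias` function and its intensity and direction definitions are simplified assumptions (the absolute value and the sign of the difference between the two cosine similarities), since this summary does not give the exact formulas.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def gender_bias(word_vec, she_vec, he_vec):
    """Hypothetical helper: compare a word's similarity to 'she' and 'he'.

    Assumed definitions (not taken from the paper):
      intensity = |sim(word, she) - sim(word, he)|
      direction = which gendered term the word is closer to
    """
    diff = cosine_similarity(word_vec, she_vec) - cosine_similarity(word_vec, he_vec)
    intensity = abs(diff)
    if diff > 0:
        direction = "female"
    elif diff < 0:
        direction = "male"
    else:
        direction = "neutral"
    return intensity, direction


# Toy 2-D vectors, invented for illustration only.
toy_vectors = {
    "she": [1.0, 0.0],
    "he": [0.0, 1.0],
    "nurse": [0.8, 0.6],
    "engineer": [0.6, 0.8],
}

for word in ("nurse", "engineer"):
    intensity, direction = gender_bias(
        toy_vectors[word], toy_vectors["she"], toy_vectors["he"]
    )
    print(f"{word}: intensity={intensity:.2f}, direction={direction}")
```

Running the same comparison on the Italian side (for example with lei and lui as the gendered terms) and comparing the two results would give a rough analogue of a before-and-after-translation check, again under the assumptions stated above.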
