A Multilingual, Culture-First Approach to Addressing Misgendering in LLM Applications
Sunayana Sitaram, Adrian de Wynter, Isobel McCrum, Qilong Gu, Si-Qing Chen
TL;DR
This paper tackles misgendering in multilingual LLM applications by developing language-specific guardrails through participatory design across 42 languages and testing them in a meeting-transcript summarization task. It employs a human-in-the-loop pipeline to generate and verify synthetic multilingual transcripts, then evaluates guardrails with both human judges and LLM evaluators using defined metrics for misgendering and quality. Results show substantial reductions in gender mistakes and assumptions without sacrificing output quality, while also revealing limitations in LLM evaluators, especially for low-resource languages. The study releases the guardrails and a 42-language synthetic dataset to enable broader research and promote culturally informed responsible AI across languages and contexts.
Abstract
Misgendering is the act of referring to someone by a gender that does not match their chosen identity. It marginalizes and undermines a person's sense of self, causing significant harm. English has clear-cut strategies for avoiding misgendering, such as the use of the pronoun "they". However, other languages pose unique challenges due to both grammatical and cultural constructs. In this work we develop methodologies to assess and mitigate misgendering across 42 languages and dialects, using a participatory-design approach to create effective and appropriate guardrails for every language. We test these guardrails in a standard LLM-based application (meeting transcript summarization), where both the data generation and the annotation steps follow a human-in-the-loop approach. We find that the proposed guardrails substantially reduce misgendering rates in the generated summaries across all languages, without incurring a loss of quality. Our human-in-the-loop approach demonstrates a feasible method for scaling inclusive and responsible AI-based solutions across multiple languages and cultures. We release the guardrails and a synthetic dataset encompassing 42 languages, along with human and LLM-judge evaluations, to encourage further research on this subject.
