Analyzing values about gendered language reform in LLMs' revisions

Jules Watson, Xi Wang, Raymond Liu, Suzanne Stevenson, Barend Beekhuizen

TL;DR

This work investigates how four instruction-tuned LLMs revise gendered role nouns and the justifications they provide for those revisions, testing alignment with feminist and trans-inclusive language reforms. By manipulating prompt explicitness, referent gender, and contextual gender associations within a sociolinguistic framework, the study reveals a broad pattern of neutralization paired with nuanced context effects. The authors deploy a large-scale prompt/control framework (14,229 instances across 527 stimuli and N = 13,609+ related data) and analyze both word choices and rationales, showing that context modulates revision behavior and that justifications reveal competing values around language reform. These findings have practical implications for value alignment in LLMs, suggesting that policy and prompts must consider social context and explicitness to influence model behavior and rationales toward inclusive language.

Abstract

Within the common LLM use case of text revision, we study LLMs' revision of gendered role nouns (e.g., outdoorsperson/woman/man) and their justifications of such revisions. We evaluate their alignment with feminist and trans-inclusive language reforms for English. Drawing on insight from sociolinguistics, we further assess if LLMs are sensitive to the same contextual effects in the application of such reforms as people are, finding broad evidence of such effects. We discuss implications for value alignment.

Paper Structure

This paper contains 24 sections, 2 figures, and 11 tables.

Figures (2)

  • Figure 1: Prompt setup and sample LLM responses.
  • Figure 2: Revision patterns. For each of the three starting role noun variants, the bars show which variant or alternative wording it was revised to, for each preamble and model. Each bar corresponds to a proportion of our 527 stimulus sentences.