Analyzing values about gendered language reform in LLMs' revisions
Jules Watson, Xi Wang, Raymond Liu, Suzanne Stevenson, Barend Beekhuizen
TL;DR
This work investigates how four instruction-tuned LLMs revise gendered role nouns and how they justify those revisions, testing alignment with feminist and trans-inclusive language reforms. By manipulating prompt explicitness, referent gender, and contextual gender associations within a sociolinguistic framework, the study finds a broad pattern of neutralization alongside context effects. The authors deploy a large-scale prompt/control framework (14,229 instances across 527 stimuli, with N = 13,609+ related data) and analyze both word choices and rationales, showing that context modulates revision behavior and that justifications reveal competing values around language reform. These findings bear on value alignment in LLMs: prompts and policies need to account for social context and explicitness if they are to steer model behavior and rationales toward inclusive language.
Abstract
Within the common LLM use case of text revision, we study LLMs' revision of gendered role nouns (e.g., outdoorsperson/woman/man) and their justifications of such revisions. We evaluate their alignment with feminist and trans-inclusive language reforms for English. Drawing on insights from sociolinguistics, we further assess whether LLMs are sensitive to the same contextual effects in the application of such reforms as people are, finding broad evidence of such effects. We discuss implications for value alignment.
