Do language models practice what they preach? Examining language ideologies about gendered language reform encoded in LLMs

Julia Watson, Sophia Lee, Barend Beekhuizen, Suzanne Stevenson

Abstract

We study language ideologies in text produced by LLMs through a case study on English gendered language reform (related to role nouns like congressperson/-woman/-man, and singular they). First, we find political bias: when asked to use language that is "correct" or "natural", LLMs produce language most similar to what they produce when asked to align with conservative (vs. progressive) values. This shows how LLMs' metalinguistic preferences can implicitly communicate the language ideologies of a particular political group, even in seemingly non-political contexts. Second, we find LLMs exhibit internal inconsistency: LLMs use gender-neutral variants more often when more explicit metalinguistic context is provided. This shows how the language ideologies expressed in text produced by LLMs can vary, which may be unexpected to users. We discuss the broader implications of these findings for value alignment.

Paper Structure (36 sections, 1 equation, 40 figures, 7 tables)

Figures (40)

  • Figure 1: Example stimuli and illustrative outputs. (Darker indicates more probable.)
  • Figure 2: Exp 1 approach, illustrated for political groups (with application to stances in the same way).
  • Figure 3: Exp 1 results. Lines show political bias: Purple lines connecting prog(-stance) and meta indicate progressive bias; orange lines connecting cons(-stance) and meta indicate conservative bias; no line means no clear bias. $x$-axis scales differ to ensure these lines are visible. Tests are based on $N = 40 \text{ names} \times 52 \text{ stimuli} = 2080$ data points for role nouns ($480$ for GPT models, with $12$ stimuli) and $N = 40 \text{ names} \times 40 \text{ stimuli} = 1600$ data points for singular pronouns.
  • Figure 4: Exp 1 results for role nouns - reduced set. Lines show political bias: Purple lines connecting prog(-stance) and meta indicate progressive bias; orange lines connecting cons(-stance) and meta indicate conservative bias; no line means no clear bias. $x$-axis scales differ to ensure these lines are visible. Tests are based on $N = 40 \text{ names} \times 12 \text{ stimuli} = 480$ data points.
  • Figure 5: Exp 1 results - text-curie-001
  • ...and 35 more figures