More Women, Same Stereotypes: Unpacking the Gender Bias Paradox in Large Language Models

Evan Chen, Run-Jun Zhan, Yan-Bai Lin, Hung-Hsuan Chen

TL;DR

This study introduces a novel evaluation framework to uncover gender biases in LLMs: using free-form storytelling to surface biases embedded within the models.

Abstract

Large Language Models (LLMs) have revolutionized natural language processing, yet concerns persist regarding their tendency to reflect or amplify social biases. This study introduces a novel evaluation framework to uncover gender biases in LLMs: using free-form storytelling to surface biases embedded within the models. A systematic analysis of ten prominent LLMs shows a consistent pattern of overrepresenting female characters across occupations, likely due to supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Paradoxically, despite this overrepresentation, the occupational gender distributions produced by these LLMs align more closely with human stereotypes than with real-world labor data. This highlights both the difficulty and the importance of implementing balanced mitigation measures that promote fairness without establishing new biases. We release the prompts and LLM-generated stories on GitHub.

Paper Structure

This paper contains 14 sections, 2 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Occupation Gender Heatmap. The occupations are ranked from the most female-oriented (left) to the most male-oriented (right) according to the GSR. Our analysis of 106 occupations reveals a significant gender skew in the LLMs' output: female characters predominated ($\geq$80% of stories) in 35 occupations, whereas male characters did so in only 5.
  • Figure 2: Boxplot of the male ratio across all occupations for each LLM and two benchmarks. All LLMs exhibit a significant bias towards female representation, with median male proportions consistently below 20%.
  • Figure 3: Occupation Gender Heatmap of GPT-4o, GPT-4o-mini and GPT2-XL. With male protagonists appearing in 82.0% of its stories, GPT2-XL shows a pattern that is completely opposite to that of the SFT- and RLHF-tuned models.
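
The quantities quoted in the captions above (per-occupation male ratio, the count of occupations where one gender reaches 80% of stories, and the median male ratio) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record format, the function names `male_ratio_by_occupation` and `summarize`, and the toy data are all assumptions made for illustration, and the sketch does not attempt to reproduce the GSR ranking, which is not defined in this overview.

```python
from collections import Counter, defaultdict
from statistics import median


def male_ratio_by_occupation(records):
    """Return {occupation: share of stories with a male protagonist}.

    `records` is a hypothetical list of (occupation, gender) pairs,
    one per generated story, with gender in {"male", "female"}.
    """
    counts = defaultdict(Counter)
    for occupation, gender in records:
        counts[occupation][gender] += 1

    ratios = {}
    for occupation, c in counts.items():
        total = c["male"] + c["female"]
        if total:  # skip occupations with no gendered protagonist
            ratios[occupation] = c["male"] / total
    return ratios


def summarize(ratios, threshold=0.8):
    """Count occupations dominated by one gender and report the median male ratio."""
    female_dominated = sorted(o for o, r in ratios.items() if 1 - r >= threshold)
    male_dominated = sorted(o for o, r in ratios.items() if r >= threshold)
    return {
        "female_dominated": female_dominated,
        "male_dominated": male_dominated,
        "median_male_ratio": median(ratios.values()),
    }


# Toy data (invented for illustration, not taken from the paper).
records = (
    [("nurse", "female")] * 9 + [("nurse", "male")] * 1
    + [("engineer", "female")] * 6 + [("engineer", "male")] * 4
    + [("plumber", "female")] * 1 + [("plumber", "male")] * 9
)

ratios = male_ratio_by_occupation(records)
summary = summarize(ratios)
print(ratios)
print(summary)
```

Applied to one model's stories, the `female_dominated` and `male_dominated` lists correspond to the kind of counts reported for Figure 1, and `median_male_ratio` to the per-model medians summarized in Figure 2.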