A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models

Leander Girrbach, Stephan Alaniz, Genevieve Smith, Zeynep Akata

TL;DR

The study addresses how modern text-to-image models embed gender biases beyond occupations by analyzing millions of generated images from 3,217 gender-neutral prompts across five models. It introduces a scalable pipeline with prompt categories, BERTopic-based clustering, automated gender labeling on cropped person boxes, and rigorous filtering to yield reliable per-prompt gender estimates. Key findings show persistent male-default representations and female-restricted care/household roles, with bias amplification relative to the LAION-400M baseline across activities, contexts, objects, and occupations. The work highlights ethical concerns and practical implications for data curation and debiasing, and provides reproducible resources to foster fairer image-generation practices.

Abstract

With the increasing use of image generation technology, understanding its social biases, including gender bias, is essential. This paper presents a large-scale study on gender bias in text-to-image (T2I) models, focusing on everyday situations. While previous research has examined biases in occupations, we extend this analysis to gender associations in daily activities, objects, and contexts. We create a dataset of 3,217 gender-neutral prompts and generate 200 images over 5 prompt variations per prompt from five leading T2I models. We automatically detect the perceived gender of people in the generated images and filter out images with no person or multiple people of different genders, leaving 2,293,295 images. To enable a broad analysis of gender bias in T2I models, we group prompts into semantically similar concepts and calculate the proportion of male- and female-gendered images for each prompt. Our analysis shows that T2I models reinforce traditional gender roles and reflect common gender stereotypes in household roles. Women are predominantly portrayed in care and human-centered scenarios, and men in technical or physical labor scenarios.
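The filtering and per-prompt ratio described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`keep_image`, `female_ratio_per_prompt`), the label strings, and the record layout are all assumptions. It also assumes the person detector and perceived-gender classifier have already run and produced one label per detected person.

```python
from collections import defaultdict

def keep_image(labels):
    """Keep an image only if it shows at least one person and all
    detected people share the same perceived-gender label.

    `labels` is a hypothetical list with one label ("female" or "male")
    per detected person box.
    """
    return len(labels) > 0 and len(set(labels)) == 1

def female_ratio_per_prompt(records):
    """Return {prompt: share of retained images labeled female}.

    `records` is an iterable of (prompt, labels) pairs, one per image.
    Prompts with no retained images are omitted from the result.
    """
    counts = defaultdict(lambda: [0, 0])  # prompt -> [female, total]
    for prompt, labels in records:
        if not keep_image(labels):
            continue  # no person, or mixed perceived genders
        counts[prompt][1] += 1
        if labels[0] == "female":
            counts[prompt][0] += 1
    return {p: f / t for p, (f, t) in counts.items()}

# Toy records (made up for illustration, not data from the paper).
records = [
    ("a person cooking", ["female"]),
    ("a person cooking", ["female", "female"]),
    ("a person cooking", ["male"]),
    ("a person cooking", []),                  # dropped: no person
    ("a person cooking", ["male", "female"]),  # dropped: mixed genders
    ("a person fixing a car", ["male"]),
]
ratios = female_ratio_per_prompt(records)
```

In this sketch an image with several people of the same perceived gender counts once, which matches the image-level proportions the abstract describes.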

Paper Structure

This paper contains 42 sections, 3 equations, 19 figures, and 11 tables.

Figures (19)

  • Figure 1: Overview of our experimental setup to analyze gender bias in T2I models regarding everyday scenarios. We show our 4 prompt groups on the bottom right and the five T2I models on the bottom left. The top left visualizes our filtering method: First, we detect people, crop the bounding boxes, and detect perceived gender. We remove images without people or showing at least one man and one woman. We calculate the proportions of female- and male-labeled images generated for each prompt and analyze systematic gender biases.
  • Figure 2: Stacked distribution of female ratios in generated images for all models and prompt groups.
  • Figure 3: (Top) Top 10 most female-dominated (top row) and top 10 most male-dominated (bottom row) activity clusters. Bars indicate the ratio of female-gendered images generated from 5 T2I models, averaged over prompts in each cluster. Error bars indicate the standard deviation across prompts. (Bottom) Top 5 most female-dominated (left) and top 5 most male-dominated (right) context clusters.
  • Figure 4: (Top) Top 5 most female-dominated (left) and top 5 most male-dominated (right) object clusters. (Bottom) Top 5 most female-dominated (left) and top 5 most male-dominated (right) occupation clusters (note that here the y-axis is between 0% and 10%).
  • Figure 5: Household-related clusters of activity prompts.
  • ...and 14 more figures