Beyond the Prompt: Gender Bias in Text-to-Image Models, with a Case Study on Hospital Professions

Franck Vandewiele, Remi Synave, Samuel Delepoulle, Remi Cozot

TL;DR

This study examines gender bias in six open-weight text-to-image models by generating images of five hospital professions under varied portrait qualifiers. Using a unified prompting framework and manual gender annotation, the authors show consistent stereotypes (nurses as female, surgeons as male) while revealing model-specific differences in prompt sensitivity and bias strength. The findings show that prompt wording (e.g., corporate vs. beautiful qualifiers) can shift gender balance, underscoring the need for bias-aware design, balanced defaults, and user guidance in generative AI. The work calls for broader mitigation strategies and for extension to intersectional dimensions to ensure fair, diverse representations in professional imagery.

Abstract

Text-to-image (TTI) models are increasingly used in professional, educational, and creative contexts, yet their outputs often embed and amplify social biases. This paper investigates gender representation in six state-of-the-art open-weight models: HunyuanImage 2.1, HiDream-I1-dev, Qwen-Image, FLUX.1-dev, Stable-Diffusion 3.5 Large, and Stable-Diffusion-XL. Using carefully designed prompts, we generated 100 images for each combination of five hospital-related professions (cardiologist, hospital director, nurse, paramedic, surgeon) and five portrait qualifiers ("", corporate, neutral, aesthetic, beautiful). Our analysis reveals systematic occupational stereotypes: all models produced nurses exclusively as women and surgeons predominantly as men. However, differences emerge across models: Qwen-Image and SDXL enforce rigid male dominance, HiDream-I1-dev shows mixed outcomes, and FLUX.1-dev skews female in most roles. HunyuanImage 2.1 and Stable-Diffusion 3.5 Large also reproduce gender stereotypes but with varying degrees of sensitivity to prompt formulation. Portrait qualifiers further modulate gender balance, with terms like corporate reinforcing male depictions and beautiful favoring female ones. Sensitivity varies widely: Qwen-Image remains nearly unaffected, while FLUX.1-dev, SDXL, and SD3.5 show strong prompt dependence. These findings demonstrate that gender bias in TTI models is both systematic and model-specific. Beyond documenting disparities, we argue that prompt wording plays a critical role in shaping demographic outcomes. The results underscore the need for bias-aware design, balanced defaults, and user guidance to prevent the reinforcement of occupational stereotypes in generative AI.
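
To make the size of the experimental grid concrete, the following minimal Python sketch enumerates the conditions named in the abstract and builds one prompt per condition. The model, profession, and qualifier names come from the abstract, and the fixed image qualifier comes from the Figure 2 caption. Everything else is an assumption made for illustration: the function name `build_prompt`, the position of the portrait qualifier inside the sentence, the a/an handling, and the reading that 100 images are generated per model for each profession and qualifier pair. It is not the authors' code.

```python
from itertools import product

# Names taken from the abstract.
MODELS = [
    "HunyuanImage 2.1", "HiDream-I1-dev", "Qwen-Image",
    "FLUX.1-dev", "Stable-Diffusion 3.5 Large", "Stable-Diffusion-XL",
]
PROFESSIONS = ["cardiologist", "hospital director", "nurse", "paramedic", "surgeon"]
QUALIFIERS = ["", "corporate", "neutral", "aesthetic", "beautiful"]
IMAGES_PER_CONDITION = 100

# Fixed image qualifier quoted in the Figure 2 caption.
IMAGE_QUALIFIER = "high quality, detailed and ultra realistic photography, 4K, HDR"


def build_prompt(profession: str, qualifier: str = "") -> str:
    """Hypothetical prompt template (ASSUMPTION: the exact wording and
    word order used in the paper may differ)."""
    head = f"{qualifier} portrait" if qualifier else "portrait"
    article = "an" if head[0] in "aeiou" else "a"
    return f"{article} {head} of a {profession}, {IMAGE_QUALIFIER}"


# One condition per (model, profession, qualifier) triple.
conditions = list(product(MODELS, PROFESSIONS, QUALIFIERS))

# ASSUMPTION: 100 images are generated per model for every
# profession/qualifier pair.
total_images = len(conditions) * IMAGES_PER_CONDITION

if __name__ == "__main__":
    print(len(conditions), "conditions,", total_images, "images")
    print(build_prompt("surgeon"))
    print(build_prompt("nurse", "aesthetic"))
```

Under these assumptions the grid has 6 × 5 × 5 = 150 conditions. Keeping the template in a single function makes it easy to swap in the paper's exact wording once it is known.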

Paper Structure

This paper contains 21 sections, 18 figures, and 17 tables.

Figures (18)

  • Figure 1: Using a simple prompt such as "a surgeon" may result in the generation of an image depicting a group of people, as illustrated in the examples. To avoid this ambiguity and ensure a single individual is represented, we use the more specific prompt: "a portrait of a surgeon".
  • Figure 2: Using the image qualifier "high quality, detailed and ultra realistic photography, 4K, HDR" helps avoid cartoon-like renderings.
  • Figure 3: Images generated by the six models (from left to right: HUNYUAN, HIDREAM, QWEN, FLUX, SD3.5, and SDXL) illustrate pronounced sexual characteristics (example shown: hospital director).
  • Figure 4: Cases where male/female classification was not possible are mainly images depicting two individuals (left: HIDREAM hospital director, center left: SDXL paramedic) or images without any person (center right: HUNYUAN paramedic, right: SDXL hospital director).
  • Figure 5: HUNYUAN: Representative outputs for each hospital profession without portrait qualifier.
  • ...and 13 more figures
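
The Figure 4 caption notes that some images could not be classified as male or female (two individuals, or no person at all). A minimal Python sketch of how manual annotations for one condition might be tallied is given below. The function name `gender_shares`, the label strings, and the choice to drop unclassifiable images from the denominator are all assumptions for illustration; the paper may handle such cases differently. The labels in the usage example are toy values, not results from the study.

```python
from collections import Counter
from typing import Optional


def gender_shares(labels: list[str]) -> Optional[dict]:
    """Compute female/male shares for one (model, profession, qualifier)
    condition from manual annotation labels.

    ASSUMPTION: labels are "female", "male", or "unclassifiable", and
    unclassifiable images are excluded from the denominator.
    """
    counts = Counter(labels)
    classified = counts["female"] + counts["male"]
    if classified == 0:
        return None  # no image in this condition could be classified
    return {
        "female": counts["female"] / classified,
        "male": counts["male"] / classified,
        "excluded": len(labels) - classified,
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    toy = ["female", "female", "female", "male", "unclassifiable"]
    print(gender_shares(toy))
```

Reporting the number of excluded images alongside the shares keeps conditions with many unclassifiable outputs visible instead of silently shrinking the sample.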