Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models

Abhishek Mandal, Susan Leavy, Suzanne Little

TL;DR

The study tackles gender bias in text-to-image diffusion models by introducing internal bias metrics that separate bias introduced at the diffusion stage from bias amplified during prompt-to-image generation. Building on the Multimodal Composite Association Score (MCAS), the authors define Diffusion Bias ($\delta$) and Bias Amplification ($\alpha$) and apply them to DALL-E 2 and Stable Diffusion v2 to analyze internal bias dynamics. Experiments across occupation, sport, object, and scene prompts reveal higher bias in female-dominated categories, demonstrate that the diffusion stage contributes to bias, and show that Stable Diffusion v2 generally exhibits stronger bias than DALL-E 2. These internal metrics offer a pathway for architecture-aware debiasing and auditing of multistage, multimodal diffusion systems. The authors acknowledge that the analysis is limited to binary gender and suggest future work on non-binary identities and broader model families.

Abstract

Text-To-Image (TTI) Diffusion Models such as DALL-E and Stable Diffusion are capable of generating images from text prompts. However, they have been shown to perpetuate gender stereotypes. These models process data internally in multiple stages and employ several constituent models, often trained separately. In this paper, we propose two novel metrics to measure bias internally in these multistage multimodal models. Diffusion Bias was developed to detect and measure bias introduced by the diffusion stage of the models. Bias Amplification measures amplification of bias during the text-to-image conversion process. Our experiments reveal that TTI models amplify gender bias, that the diffusion process itself contributes to bias, and that Stable Diffusion v2 is more prone to gender bias than DALL-E 2.

Paper Structure

This paper contains 16 sections, 5 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Association Scores in Diffusion Models. A generalised diagram showing the working of diffusion models like DALL-E 2 and Stable Diffusion. The embeddings are generated using an external CLIP model. Source: Mandal et al. [mandal2023measuring].
  • Figure 2: Diffusion Bias ($\delta$) vs Bias Amplification ($\alpha$).
  • Figure 3: DALL-E 2 Diffusion Bias ($\delta$) vs Bias Amplification ($\alpha$) polynomial regression with LOESS smoothing.
  • Figure 4: Stable Diffusion v2 Diffusion Bias ($\delta$) vs Bias Amplification ($\alpha$) polynomial regression with LOESS smoothing.