Bias Begets Bias: The Impact of Biased Embeddings on Diffusion Models

Sahil Kuchlous; Marvin Li; Jeffrey G. Wang

TL;DR

The paper demonstrates theoretically and empirically that an unbiased text embedding space for input prompts is a necessary condition for representationally balanced diffusion models, meaning that the distribution of generated images satisfies diversity requirements with respect to protected attributes.

Abstract

With the growing adoption of Text-to-Image (TTI) systems, the social biases of these models have come under increased scrutiny. Herein we conduct a systematic investigation of one such source of bias for diffusion models: embedding spaces. First, because traditional classifier-based fairness definitions require true labels not present in generative modeling, we propose statistical group fairness criteria based on a model's internal representation of the world. Using these definitions, we demonstrate theoretically and empirically that an unbiased text embedding space for input prompts is a necessary condition for representationally balanced diffusion models, meaning that the distribution of generated images satisfies diversity requirements with respect to protected attributes. Next, we investigate the impact of biased embeddings on evaluating the alignment between generated images and prompts, a process commonly used to assess diffusion models. We find that biased multimodal embeddings like CLIP can result in lower alignment scores for representationally balanced TTI models, thus rewarding unfair behavior. Finally, we develop a theoretical framework through which biases in alignment evaluation can be studied and propose bias mitigation methods. By specifically adopting the perspective of embedding spaces, we establish new fairness conditions for diffusion model development and evaluation.

Paper Structure (23 sections, 7 theorems, 26 equations, 4 figures, 4 tables)

Introduction
Related Work
Preliminaries
Biased Embedding, Biased Generations
Bias in Alignment Auditing
Definitions of Fairness
Properties of Fair Embeddings
Auditing Alignment with Biased Embeddings
Empirical Case-Study: CLIP
Conclusion and Future Work
Technical Background on Diffusion Models
Details from Section "Biased Embedding, Biased Generations"
Biased Embeddings Correlate with Representationally Imbalanced Generations
Additional Details of Diffusion-From-Scratch Training
Synthetic Data Generation.
...and 8 more sections

Key Result

Theorem 4.5

Assume WLOG that the embedding of $b$ is $\frac{\varepsilon}{\sqrt{T}L}$-close to the embedding of $a_1+b$, i.e. $\|e_{\mathcal{P}}(b)-e_{\mathcal{P}}(a_1+b)\| \leq \frac{\varepsilon}{\sqrt{T}L}.$ Under the paper's Lipschitz and distinctness assumptions (labels assump:lipschitz and assum:distinct), we have $\mathrm{TV}(p_{b},p_{a_1+b}) \leq \varepsilon$ and ...
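As a rough illustration of how the hypothesis and the total-variation conclusion of Theorem 4.5 relate numerically, the following minimal sketch rearranges the stated inequality. It assumes that $T$ and $L$ are the constants appearing in the theorem (the excerpt does not define them; they are presumably a diffusion time horizon and a Lipschitz constant), and all function names are hypothetical. The bound is only meaningful when the theorem's assumptions hold.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closeness_threshold(eps, T, L):
    """Largest embedding distance permitted by the hypothesis: eps / (sqrt(T) * L).

    ASSUMPTION: T and L are the constants from Theorem 4.5; their exact
    meaning is not given in this excerpt.
    """
    return eps / (math.sqrt(T) * L)


def implied_tv_bound(embed_dist, T, L):
    """Smallest eps for which the hypothesis holds at this distance.

    Rearranging dist <= eps / (sqrt(T) * L) gives eps >= sqrt(T) * L * dist.
    The result is capped at 1 because total variation never exceeds 1.
    """
    return min(1.0, math.sqrt(T) * L * embed_dist)


if __name__ == "__main__":
    # Toy numbers chosen for illustration only, not taken from the paper.
    T, L = 100, 2.0
    print(closeness_threshold(0.1, T, L))
    print(implied_tv_bound(0.005, T, L))
```

Read this way, the closer the embedding of the neutral prompt $b$ sits to the embedding of the attribute-specific prompt $a_1+b$, the tighter the bound on how far the two generated image distributions can differ.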

Figures (4)

Figure 1: Each point represents a profession from the $\texttt{Professions}$ dataset luccioni2023stable. The x-value is the ratio of cosine similarities between original and gendered versions of the prompt, and the y-value is the proportion of images that are classified as men. Line of best fit is in red and $R$-squared is reported.
Figure 2: For each category in the six classes, we sampled three random images among the 6000 synthetic training images and display them here.
Figure 3: We illustrate eight random samples from each of the three classes generated from our trained diffusion model.
Figure 4: Bias in auditing methods. The x-axis represents the proportion of the images that are male and the y-axis represents the score.
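
The quantities plotted in Figure 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the caption does not spell out the ratio, so the sketch assumes it is the cosine similarity of the neutral prompt to its male version divided by the similarity to its female version. The function names are hypothetical, and the vectors in the usage note are toy values rather than real embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_ratio(neutral, male, female):
    """ASSUMED definition of the Figure 1 x-value:
    cos(neutral, male) / cos(neutral, female)."""
    return cosine_similarity(neutral, male) / cosine_similarity(neutral, female)


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

For example, `similarity_ratio([1.0, 1.0], [1.0, 0.0], [0.0, 1.0])` is approximately 1 because the toy neutral vector is equally similar to both gendered vectors. A ratio above 1 would indicate an embedding that leans toward the male version, under the assumed definition.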

Theorems & Definitions (16)

Definition 4.1
Definition 4.2
Theorem 4.5: Bias in embeddings implies bias in image generations
proof
Definition 5.1: Multiaccuracy
Theorem 5.2
proof
Theorem 5.3
proof
Theorem 5.4
...and 6 more
