PopAlign: Population-Level Alignment for Fair Text-to-Image Generation

Shufan Li; Harkanwar Singh; Aditya Grover

TL;DR

PopAlign addresses population-level biases in text-to-image diffusion models by extending alignment from pairwise, sample-level preferences to population-level signals. It collects multi-sample preferences via identity-augmented prompts, formulates a Bradley-Terry-based objective, and derives a tractable diffusion-specific lower bound to optimize the model without retraining on large datasets. Empirical results on SDXL show substantial reductions in gender and race biases while preserving generation quality across identity-neutral, identity-specific, and generic prompts. The approach demonstrates practical potential for safer T2I deployments, though it relies on classifier reliability and acknowledges limitations regarding non-binary identities and broader bias coverage.
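For reference, the standard pairwise Bradley-Terry model and the DPO loss that builds on it can be written as below. This is a sketch of the well-known sample-level formulation only; the symbols (reward r, policy \pi_\theta, reference policy \pi_{\mathrm{ref}}, temperature \beta, preferred and dispreferred samples y_w and y_l) follow common convention and are assumptions here, and PopAlign's population-level objective and its diffusion lower bound are not reproduced.

```latex
% Bradley-Terry: probability that sample y_w is preferred over y_l for prompt x
p(y_w \succ y_l \mid x) = \sigma\big( r(x, y_w) - r(x, y_l) \big)

% Standard (sample-level) DPO loss derived from the model above
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l)}
  \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right) \right]
```

Both expressions compare one sample against another, which is the setting the TL;DR describes PopAlign as extending to sets of samples.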

Abstract

Text-to-image (T2I) models achieve high-fidelity generation through extensive training on large datasets. However, these models may unintentionally pick up undesirable biases from their training data, such as over-representation of particular identities in gender-neutral or ethnicity-neutral prompts. Existing alignment methods such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) fail to address this problem effectively because they operate on pairwise preferences between individual samples, while these biases can only be measured at the population level. For example, a single sample for the prompt "doctor" could be male or female, but a model that generates predominantly male doctors under repeated sampling reflects a gender bias. To address this limitation, we introduce PopAlign, a novel approach for population-level preference optimization: whereas standard preference optimization compares individual samples, PopAlign prefers entire sets of samples over others. For scalable training, we further derive a stochastic lower bound that directly optimizes for individual samples from preferred populations over those from other populations. Using human evaluation and standard image quality and bias metrics, we show that PopAlign significantly mitigates the bias of pretrained T2I models while largely preserving generation quality. Code is available at https://github.com/jacklishufan/PopAlignSDXL.
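The "doctor" example can be made concrete with a minimal sketch that scores a whole set of samples rather than a single image. The function name identity_discrepancy, the group labels, and the label lists are hypothetical illustrations (the labels stand in for the output of some identity classifier), and the score is a simple max-minus-min share, not necessarily the bias metric used in the paper.

```python
from collections import Counter


def identity_discrepancy(labels, groups=("male", "female")):
    """Return the gap between the largest and smallest group share.

    `labels` holds one (hypothetical) classifier label per generated image.
    A result of 0.0 means the groups are equally represented; 1.0 means
    only one group appears in the population.
    """
    if not labels:
        raise ValueError("labels must contain at least one sample")
    counts = Counter(labels)
    shares = [counts.get(group, 0) / len(labels) for group in groups]
    return max(shares) - min(shares)


# Illustrative labels for 10 images sampled from the prompt "doctor".
biased_population = ["male"] * 9 + ["female"] * 1
balanced_population = ["male"] * 5 + ["female"] * 5

biased_score = identity_discrepancy(biased_population)
balanced_score = identity_discrepancy(balanced_population)
```

Any single image in either list is an acceptable "doctor"; the difference only shows up in the score computed over the full set, which is why a pairwise, sample-level preference cannot express it.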

Paper Structure (36 sections, 15 equations, 12 figures, 4 tables)

Introduction
Related Works
Diversity and fairness in image generation
Aligning generative models with human preferences
Background
Reinforcement Learning with Human Feedback
Direct Preference Optimization
Diffusion models
Diffusion-DPO
Method
Population-Level Preference Acquisition
Population-Level Alignment from Human Preferences
Population Level Alignment of Text-to-Image Diffusion Models
Synthetic Evaluation
Experiments
...and 21 more sections

Figures (12)

Figure 1: Illustration of PopAlign, our proposed framework for mitigating the bias of pretrained T2I models using population-level alignment. Left: SDXL over-represents a particular identity as it picked up biases of the training data. Right: PopAlign mitigates the biases without compromising the quality of generated samples.
Figure 2: Difference between PopAlign and existing RLHF/DPO methods. Left: Existing methods such as RLHF/DPO use pairwise preferences of individual samples to improve image quality. Right: PopAlign uses population-level preferences to achieve better fairness and diversity.
Figure 4: Qualitative results on gender-neutral prompts. PopAlign mitigates the bias of the pretrained SDXL on both male-skewed and female-skewed prompts.
Figure 5: Human Evaluation on fairness and quality of the image population
Figure 6: Ablation study over varying classifier-free guidance (CFG) scales
...and 7 more figures
