Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models

Leander Girrbach, Stephan Alaniz, Genevieve Smith, Trevor Darrell, Zeynep Akata

TL;DR

This paper presents the first large-scale, person-centric audit of LAION-400M, annotating the dataset with person bounding boxes, perceived gender and race/ethnicity labels, and per-person captions. The authors document demographic imbalances and harmful associations in the data and show that 60-70% of the gender bias observed in CLIP and Stable Diffusion can be linearly explained by direct co-occurrences in the training data. They introduce an automatic labeling pipeline, validate that it labels with high accuracy, and use sparse autoencoder (SAE) analyses to uncover identity-linked topics. The work establishes an empirical link between pretraining data composition and downstream model bias and offers a foundation for dataset rebalancing and fairer AI systems, while acknowledging the ethical and methodological limitations of perceived demographic labels.

Abstract

Vision-language models trained on large-scale multimodal datasets show strong demographic biases, but the role of training data in producing these biases remains unclear. A major barrier has been the lack of demographic annotations in web-scale datasets such as LAION-400M. We address this gap by creating person-centric annotations for the full dataset, including over 276 million bounding boxes, perceived gender and race/ethnicity labels, and automatically generated captions. These annotations are produced through validated automatic labeling pipelines combining object detection, multimodal captioning, and finetuned classifiers. Using them, we uncover demographic imbalances and harmful associations, such as the disproportionate linking of men and individuals perceived as Black or Middle Eastern with crime-related and negative content. We also show that 60-70% of gender bias in CLIP and Stable Diffusion can be linearly explained by direct co-occurrences in the data. Our resources establish the first large-scale empirical link between dataset composition and downstream model bias.
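To make "linearly explained" concrete, the following is a minimal sketch of how the share of variance in a model-side bias score accounted for by a data-side co-occurrence score could be measured with a one-variable least-squares fit. The function name `r_squared` and all numbers are hypothetical illustrations; this is not the authors' procedure or data.

```python
def r_squared(x, y):
    """Fraction of the variance in y explained by a least-squares line fitted on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form ordinary least squares for a single predictor with intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give the coefficient of determination.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-concept scores (assumed, not taken from the paper):
# x = gender skew of a concept in the training data,
# y = gender skew of the same concept measured in a trained model.
data_skew = [-0.4, -0.2, 0.0, 0.1, 0.3, 0.5]
model_skew = [-0.35, -0.05, 0.05, 0.0, 0.4, 0.35]
print(f"variance explained: {r_squared(data_skew, model_skew):.2f}")
```

Under this reading, a value between 0.6 and 0.7 would correspond to the 60-70% figure quoted above; how the paper actually defines and aggregates the two scores is not stated in this summary.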

Paper Structure

This paper contains 29 sections, 21 equations, 20 figures, and 12 tables.

Figures (20)

  • Figure 1: Workflow for annotating gender and race/ethnicity in LAION-400M. We detect $\sim$200M person bounding boxes with YOLO11. An MLLM ensemble (Phi-3.5-Vision, LLaVA-NeXT, InternVL3) provides gender and race/ethnicity labels on sampled subsets, and only consensus predictions are used to train SigLIP classifiers. These classifiers then label the full dataset, while InternVL3 generates person-centric captions. The resulting annotations enable systematic analysis of dataset composition, harmful correlations, and bias transfer to downstream models.
  • Figure 2: Left: Log-scaled histogram of bounding box areas, expressed as a percentage of the total image area. Right: The distribution of the number of detected people per image.
  • Figure 3: Example person-centric annotations in LAION-400M. Bounding boxes are in red and contain the gender and race/ethnicity labels. Captions by InternVL (one per image) are below.
  • Figure 4: Distribution of perceived gender (left) and perceived race/ethnicity labels (right) in 199,931,986 bounding boxes and 107,545,236 unique images.
  • Figure 5: Distribution of the relative change ($\Delta$) in representation for gender (left) and race/ethnicity (right) groups when associated with 63 crime words. $\Delta$ is calculated relative to the baseline distribution of each group. Positive values indicate a stronger association than expected.
  • ...and 15 more figures
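
The relative change $\Delta$ described for Figure 5 can be sketched as follows. This is a minimal illustration that assumes $\Delta = (p_{\text{subset}} - p_{\text{baseline}}) / p_{\text{baseline}}$, where $p$ is a group's share of labels; the caption does not give the exact formula, and the function name `relative_change` and the group labels are hypothetical.

```python
from collections import Counter


def relative_change(baseline_labels, subset_labels):
    """Per-group relative change in share between a subset and the baseline.

    Assumed definition: (p_subset - p_baseline) / p_baseline.
    """
    base_counts = Counter(baseline_labels)
    subset_counts = Counter(subset_labels)
    n_base = len(baseline_labels)
    n_subset = len(subset_labels)
    deltas = {}
    for group, count in base_counts.items():
        p_base = count / n_base
        # Groups missing from the subset get a share of 0, i.e. a delta of -1.
        p_subset = subset_counts.get(group, 0) / n_subset
        deltas[group] = (p_subset - p_base) / p_base
    return deltas


# Hypothetical labels: the baseline is balanced, while the subset of samples
# matching some word list is skewed toward group_a.
baseline = ["group_a"] * 50 + ["group_b"] * 50
subset = ["group_a"] * 75 + ["group_b"] * 25
deltas = relative_change(baseline, subset)
print(deltas)
```

With this definition, a positive value means the group appears more often in the subset than its baseline share would predict, which matches the caption's reading of positive $\Delta$.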