Language-guided Detection and Mitigation of Unknown Dataset Bias

Zaiying Zhao; Soichiro Kumano; Toshihiko Yamasaki

TL;DR

The paper tackles unknown dataset biases, which impair classifier accuracy on minority groups. It proposes a language-guided framework with two stages. First, vision-language models caption the training images and GPT-4 extracts bias keywords from those captions. Second, the keywords drive one of two mitigation methods: Language-guided Group-DRO, which turns the keywords into pseudo-labels for bias attributes so that Group-DRO can be applied, and Language-guided Diffusion-based Augmentation, which uses the keywords as Stable Diffusion prompts to generate minority-group images. On CMNIST, Waterbirds, and CelebA, the approach outperforms existing methods that do not assume prior knowledge of the bias and is comparable with a method that does. Because biases are reported as textual keywords, the detected biases remain interpretable, and the framework is reported to be robust across backbones and tasks.
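The augmentation idea in the summary above can be illustrated with a minimal sketch. It assumes captions and bias keywords are already available, flags (class, keyword) pairs that rarely co-occur as minority groups, and builds a text prompt for each. The function names, the `max_rate` threshold, the whole-word matching, and the prompt template are all hypothetical choices made for illustration, not the paper's actual procedure.

```python
from collections import Counter

def keyword_rates(samples, keywords):
    """Fraction of each class's captions that contain each keyword."""
    per_class = Counter(label for label, _ in samples)
    hits = Counter()
    for label, caption in samples:
        words = set(caption.lower().split())  # naive whole-word matching
        for kw in keywords:
            if kw in words:
                hits[(label, kw)] += 1
    return {(label, kw): hits[(label, kw)] / per_class[label]
            for label in per_class for kw in keywords}

def minority_prompts(samples, keywords, class_names, max_rate=0.3):
    """Build one prompt per (class, keyword) pair that is under-represented."""
    rates = keyword_rates(samples, keywords)
    return sorted(
        f"a photo of a {class_names[label]}, {kw}"  # hypothetical template
        for (label, kw), rate in rates.items()
        if rate < max_rate
    )

# Toy captions: class 0 mostly appears with "forest", class 1 with "water".
samples = [
    (0, "a bird perched in a forest"),
    (0, "a small bird in the forest"),
    (0, "a bird on a branch in a forest"),
    (0, "a bird standing near water"),
    (1, "a bird swimming on water"),
    (1, "a bird flying over water"),
    (1, "a duck floating on the water"),
    (1, "a bird walking in a forest"),
]
class_names = {0: "landbird", 1: "waterbird"}
prompts = minority_prompts(samples, ["forest", "water"], class_names)
print(prompts)
```

On this toy data the two rare combinations (landbird with water, waterbird with forest) are the ones that receive prompts; in practice such prompts would be passed to a text-to-image model.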

Abstract

Dataset bias is a significant problem in training fair classifiers. When attributes unrelated to classification exhibit strong biases towards certain classes, classifiers trained on such a dataset may overfit to these bias attributes, substantially reducing the accuracy for minority groups. Mitigation techniques can be categorized according to the availability of bias information (i.e., prior knowledge). Although scenarios with unknown biases are better suited to real-world settings, previous work in this field often suffers from a lack of interpretability regarding biases and from lower performance. In this study, we propose a framework that identifies potential biases as keywords, without prior knowledge, based on their partial occurrence in the captions. We further propose two debiasing methods: (a) handing over to an existing debiasing approach that requires prior knowledge, by assigning pseudo-labels, and (b) employing data augmentation via text-to-image generative models, using the acquired bias keywords as prompts. Despite its simplicity, experimental results show that our framework not only outperforms existing methods without prior knowledge but is also comparable with a method that assumes prior knowledge.
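Method (a) can be sketched as follows, under stated assumptions: each sample gets a pseudo-group from its class label and the first bias keyword found in its caption, and the training signal focuses on the group with the highest mean loss. The helper names and the keyword-matching rule are hypothetical. The hard maximum shown here is only the worst-group objective that Group-DRO targets; the original Group-DRO algorithm optimizes it with adaptive group weights rather than a hard max.

```python
def pseudo_group(label, caption, keywords):
    """Pair the class label with the index of the first matching keyword."""
    words = set(caption.lower().split())  # naive whole-word matching
    for idx, kw in enumerate(keywords):
        if kw in words:
            return (label, idx)
    return (label, len(keywords))  # extra bucket: no keyword matched

def worst_group_loss(losses, groups):
    """Return the group with the highest mean loss, and that mean."""
    totals, counts = {}, {}
    for loss, group in zip(losses, groups):
        totals[group] = totals.get(group, 0.0) + loss
        counts[group] = counts.get(group, 0) + 1
    means = {g: totals[g] / counts[g] for g in totals}
    worst = max(means, key=means.get)
    return worst, means[worst]

# Toy batch of (class label, caption) pairs with per-sample losses.
keywords = ["forest", "water"]
batch = [
    (0, "a bird in a forest"),
    (0, "a bird on the water"),
    (1, "a bird on the water"),
    (1, "a bird in a forest"),
]
losses = [0.2, 1.5, 0.3, 0.9]
groups = [pseudo_group(label, caption, keywords) for label, caption in batch]
worst, value = worst_group_loss(losses, groups)
print(worst, value)
```

Here the group (class 0, keyword "water") has the highest mean loss, so a worst-group objective would direct the update toward it.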

Paper Structure (21 sections, 2 equations, 7 figures, 13 tables, 1 algorithm)

Introduction
Related work
Method
Problem definition
Language-guided bias detection
Bias mitigation leveraging extracted bias keywords
Experiments
Experimental detail
Main results
Additional experiment
Framework analysis
Extraction of bias keywords
Visualization of generated images
Conclusion
Dataset details
...and 6 more sections

Figures (7)

Figure 1: Overview of our framework. First, we perform Language-guided Bias Detection, which identifies bias keywords using GPT-4 from captions generated by VLMs (e.g., BLIP). With these extracted keywords, we propose (i) Language-guided Group-DRO, which adapts Group-DRO by leveraging the extracted bias keywords as pseudo-bias labels, and (ii) Language-guided Diffusion-based Augmentation, which generates images for the minority groups leveraging the bias keywords as input prompts.
Figure 2: Samples with bias-aligned colors and extracted bias keywords in the CMNIST dataset.
Figure 3: Visualization of images generated by our Language-guided Diffusion-based Augmentation. The generated images are consistent with minority groups (e.g., landbird on water, blonde man) and are effective enough to resolve the data imbalance.
Figure 4: Samples from CMNIST training and test data.
Figure 5: Additional visualization of generated images on the CMNIST dataset. We generate images of digits with bias-conflicting colors for each class (e.g., the bias-aligned color for class zero is red, so we generate the digit zero in colors other than red).
...and 2 more figures
