Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing
Massimiliano Ciranni, Vito Paolo Pastore, Roberto Di Via, Enzo Tartaglione, Francesca Odone, Vittorio Murino
TL;DR
Diffusing DeBias (DDB) tackles the challenge of spurious biases in image classification by leveraging conditional diffusion models to synthesize bias-aligned data per class. A Bias Amplifier trained on this synthetic data provides reliable supervisory signals, which are then integrated into two debiasing recipes (two-step and end-to-end) to produce robust debiased classifiers. Across six biased datasets, DDB achieves state-of-the-art unsupervised debiasing, demonstrating strong generalization and resilience when biases are absent. The approach offers a versatile plug-in for existing debiasing methods, albeit with high diffusion-model training costs, and shows promise for scalable, bias-aware learning in real-world settings.
Abstract
The effectiveness of deep learning models in classification tasks is often limited by the quality and quantity of training data, particularly when the data exhibit strong spurious correlations between specific attributes and target labels. Such correlations are a form of bias in the training data and typically lead to an unrecoverable loss of generalization in prediction. This paper addresses the problem through bias amplification with generated synthetic data: we introduce Diffusing DeBias (DDB), a novel approach that acts as a plug-in for common unsupervised model debiasing methods and exploits the inherent tendency of diffusion models to learn bias during data generation. Specifically, our approach uses conditional diffusion models to generate synthetic bias-aligned images, which replace the original training set for learning an effective bias amplifier model. We then incorporate this model into both an end-to-end and a two-step unsupervised debiasing approach. By tackling the memorization of bias-conflicting training samples in auxiliary models, a fundamental issue typical of this type of technique, our method outperforms the current state of the art on multiple benchmark datasets, demonstrating its potential as a versatile and effective tool for tackling bias in deep learning models. Code is available at https://github.com/Malga-Vision/DiffusingDeBias
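To make the role of the bias amplifier concrete, the following is a minimal sketch of how its output could supply a supervisory signal in a two-step recipe. It is an illustration under stated assumptions, not the method of the paper: the abstract does not specify how the signal is extracted, so the thresholding rule, the upweighting factor, and all function names (`per_sample_loss`, `flag_bias_conflicting`, `sample_weights`) are hypothetical. The sketch assumes that an amplifier trained only on synthetic bias-aligned images assigns low probability to the true label of bias-conflicting real samples, so a high loss can serve as a proxy for "bias-conflicting".

```python
import math

def per_sample_loss(probs, labels):
    """Cross-entropy of the amplifier's predicted probabilities w.r.t. the true labels."""
    return [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)]

def flag_bias_conflicting(losses, threshold):
    """Assumption: a high amplifier loss marks a sample as likely bias-conflicting."""
    return [loss > threshold for loss in losses]

def sample_weights(flags, upweight):
    """Assumption: flagged samples are upweighted when training the debiased model."""
    return [upweight if flagged else 1.0 for flagged in flags]

# Toy amplifier outputs on four real training images (two classes).
# The amplifier is confident and correct on samples 0 and 1,
# and confidently wrong on samples 2 and 3.
amplifier_probs = [[0.95, 0.05], [0.10, 0.90], [0.85, 0.15], [0.20, 0.80]]
labels = [0, 1, 1, 0]

losses = per_sample_loss(amplifier_probs, labels)
flags = flag_bias_conflicting(losses, threshold=1.0)   # hypothetical threshold
weights = sample_weights(flags, upweight=5.0)          # hypothetical factor

print(flags)    # [False, False, True, True]
print(weights)  # [1.0, 1.0, 5.0, 5.0]
```

In this toy setting the two samples the amplifier misclassifies are flagged and would receive a larger weight in the second training step. The signal is only useful if the amplifier has not memorized bias-conflicting samples, which is the motivation the abstract gives for training it on synthetic bias-aligned data instead of the original training set.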
