BLADE: Bias-Linked Adaptive DEbiasing
Piyush Arora, Navlika Singh, Vasubhya Diwan, Pratik Mazumder
TL;DR
Bias in neural networks often stems from spurious correlations in training data, leading to brittle generalization when contexts shift. BLADE addresses this without bias annotations or bias-conflicting samples. It learns a bias-translation generator, aligns original and translated views at the instance level, applies bias-invariant regularization, and adaptively refines samples based on their susceptibility to bias, jointly optimizing a debiased classifier and a bias-sensitive model. Across synthetic and real-world benchmarks, BLADE achieves state-of-the-art debiasing performance, including substantial gains under fully biased and multi-bias conditions, demonstrating robust generalization without bias supervision and offering a scalable approach to robust deep learning.
Abstract
Neural networks have revolutionized numerous fields, yet they remain vulnerable to a critical flaw: the tendency to learn implicit biases, i.e., spurious correlations between certain attributes and target labels in the training data. These biases are often more prevalent and easier to learn, causing models to rely on superficial patterns rather than the task-relevant features necessary for generalization. Existing methods typically rely on strong assumptions, such as prior knowledge of these biases or access to bias-conflicting samples (samples that contradict the spurious correlations) to counterbalance bias-aligned samples (samples that conform to them). However, such assumptions are often impractical in real-world settings. We propose BLADE (Bias-Linked Adaptive DEbiasing), a generative debiasing framework that requires neither prior knowledge of the bias nor bias-conflicting samples. BLADE first trains a generative model to translate images across bias domains while preserving task-relevant features. It then adaptively refines each image with its synthetic counterpart based on the image's susceptibility to bias. To encourage robust representations, BLADE aligns each image with its bias-translated synthetic counterpart, which shares task-relevant features but differs in bias, while misaligning it with samples that share the same bias. We evaluate BLADE on multiple benchmark datasets and show that it significantly outperforms state-of-the-art methods. Notably, it exceeds the closest baseline by an absolute margin of around 18% on the corrupted CIFAR-10 dataset under the worst-group setting, establishing a new benchmark in bias mitigation and demonstrating its potential for developing more robust deep learning models without explicit supervision.
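To make the refinement and alignment steps concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The abstract does not specify the loss or the blending rule, so everything here is illustrative: `refine` assumes a simple convex blend weighted by a per-sample susceptibility score in [0, 1] (how that score is obtained, for example from a bias-sensitive model, is left open), and `alignment_loss` assumes an InfoNCE-style contrastive objective in which the bias-translated view is the positive and other samples with the same hypothetical bias pseudo-label `bias_ids` are the negatives.

```python
import numpy as np


def refine(x, x_translated, susceptibility):
    """Blend each image with its bias-translated counterpart.

    Assumption: a convex combination, where a higher susceptibility
    puts more weight on the translated view. Shapes: x and
    x_translated are (N, ...); susceptibility is (N,).
    """
    w = np.clip(susceptibility, 0.0, 1.0)
    # Reshape to (N, 1, 1, ...) so the weight broadcasts over each image.
    w = w.reshape(-1, *([1] * (x.ndim - 1)))
    return (1.0 - w) * x + w * x_translated


def alignment_loss(z, z_translated, bias_ids, temperature=0.1):
    """InfoNCE-style sketch of instance-level alignment.

    Positive: the sample's own bias-translated view.
    Negatives: other samples sharing the same (hypothetical) bias id.
    z and z_translated are (N, D) embeddings; bias_ids is (N,).
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    zt = z_translated / np.linalg.norm(z_translated, axis=1, keepdims=True)

    pos = np.sum(z * zt, axis=1) / temperature   # (N,)
    sim = (z @ z.T) / temperature                # (N, N)

    same_bias = bias_ids[:, None] == bias_ids[None, :]
    np.fill_diagonal(same_bias, False)           # a sample is not its own negative
    neg = np.where(same_bias, sim, -np.inf)      # mask out different-bias pairs

    # Numerically stable log-sum-exp over [positive, negatives].
    logits = np.concatenate([pos[:, None], neg], axis=1)
    m = logits.max(axis=1, keepdims=True)
    log_denominator = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return float(np.mean(log_denominator - pos))
```

In this sketch, the loss falls as an embedding moves toward its translated view and away from same-bias neighbours, which mirrors the align/misalign behaviour described above. The temperature value and the choice of negatives are placeholders that would need to be replaced by the formulation given in the paper's method section.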
