BLADE: Bias-Linked Adaptive DEbiasing

Piyush Arora, Navlika Singh, Vasubhya Diwan, Pratik Mazumder

TL;DR

Bias in neural networks often stems from spurious correlations in training data, leading to brittle generalization when contexts shift. BLADE addresses this without bias annotations or bias-conflicting samples by learning a bias-translation generator, aligning original and translated views at the instance level, and adaptively refining samples based on their susceptibility to bias. The framework combines a bias-translation module, instance-level alignment, and bias-invariant regularization with an adaptive refinement strategy, jointly optimizing a de-biased classifier and a bias-sensitive model. Across synthetic and real-world benchmarks, BLADE achieves state-of-the-art debiasing performance, including substantial gains under fully biased and multi-bias conditions, demonstrating robust generalization without bias supervision and a scalable approach to robust deep learning.

Abstract

Neural networks have revolutionized numerous fields, yet they remain vulnerable to a critical flaw: the tendency to learn implicit biases, that is, spurious correlations between certain attributes and target labels in the training data. These biases are often more prevalent and easier to learn, causing models to rely on superficial patterns rather than the task-relevant features necessary for generalization. Existing methods typically rely on strong assumptions, such as prior knowledge of these biases or access to bias-conflicting samples (samples that contradict the spurious correlations) to counterbalance bias-aligned samples (samples that conform to them). However, such assumptions are often impractical in real-world settings. We propose BLADE (Bias-Linked Adaptive DEbiasing), a generative debiasing framework that requires neither prior knowledge of the bias nor bias-conflicting samples. BLADE first trains a generative model to translate images across bias domains while preserving task-relevant features. It then adaptively refines each image with its synthetic counterpart based on the image's susceptibility to bias. To encourage robust representations, BLADE aligns an image with its bias-translated synthetic counterpart, which shares task-relevant features but differs in bias, while misaligning it with samples that share the same bias. We evaluate BLADE on multiple benchmark datasets and show that it significantly outperforms state-of-the-art methods. Notably, it exceeds the closest baseline by an absolute margin of around 18% on the Corrupted CIFAR-10 dataset under the worst-group setting, establishing a new benchmark in bias mitigation and demonstrating its potential for developing more robust deep learning models without explicit bias supervision.
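
The adaptive refinement step described in the abstract can be pictured as a per-sample blend of each image with its bias-translated counterpart. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name refine_batch, the convex-mixing form, the assumption that the score lies in [0, 1], and the direction of the weighting (a higher score keeps more of the original image) are all hypothetical choices made for illustration.

```python
import numpy as np


def refine_batch(originals, translated, scores):
    """Blend each original image with its bias-translated counterpart.

    originals, translated: arrays of shape (N, ...) holding paired images.
    scores: length-N per-sample weights, ASSUMED to lie in [0, 1].
            In this sketch a higher score keeps more of the original;
            the source does not specify the mixing rule or its direction.
    """
    originals = np.asarray(originals, dtype=np.float64)
    translated = np.asarray(translated, dtype=np.float64)
    if originals.shape != translated.shape:
        raise ValueError("originals and translated must have the same shape")

    w = np.clip(np.asarray(scores, dtype=np.float64), 0.0, 1.0)
    if w.shape != (originals.shape[0],):
        raise ValueError("scores must contain one value per sample")

    # Reshape (N,) -> (N, 1, 1, ...) so each weight broadcasts over its image.
    w = w.reshape(-1, *([1] * (originals.ndim - 1)))
    return w * originals + (1.0 - w) * translated
```

With a score of 0 the refined sample is the translated image, with a score of 1 it is the untouched original, and intermediate scores interpolate between the two.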

Paper Structure

This paper contains 37 sections, 11 equations, 4 figures, 7 tables, 1 algorithm.

Figures (4)

  • Figure 1: (a) The model is trained on bias-translated generated samples. (b) Refined samples are created by adaptively mixing original and bias-translated generated samples based on bias-conflicting severity (BCS). (c) The Bias Regularization loss is computed on samples translated to a common sampled bias domain, promoting invariance across bias-translated variants of a sample. (d) The Instance De-biased Alignment loss encourages consistency between the original samples and their bias-translated counterparts.
  • Figure 2: t-SNE visualization comparing feature representations learned by BLADE and the vanilla model on the unbiased test set of Colored MNIST.
  • Figure 3: Grad-CAM heatmap visualizations on bFFHQ unbiased test set, comparing the Vanilla model and BLADE.
  • Figure 4: Translation of images across bias domains by the modified StarGAN. The first column shows the original instances to be translated. The top row shows reference instances and the bias domains into which the original instances are translated.
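
The alignment idea in Figure 1(d), pulling an image toward its bias-translated counterpart while pushing it away from samples that share its bias, resembles a contrastive objective. The following is a minimal sketch using an InfoNCE-style loss as a stand-in; the paper's actual loss is not given in this section. The function name alignment_loss, the temperature value, and the same_bias mask (here supplied by the caller, whereas BLADE uses no bias annotations and would have to estimate it) are hypothetical.

```python
import numpy as np


def alignment_loss(z, z_trans, same_bias, temperature=0.1):
    """InfoNCE-style loss: positive = bias-translated counterpart,
    negatives = other samples flagged as sharing the anchor's bias.

    z, z_trans: (N, D) features of originals and their translated versions.
    same_bias:  (N, N) boolean mask, same_bias[i, j] is True when sample j
                is ASSUMED to share sample i's bias (hypothetical input).
    """
    z = np.asarray(z, dtype=np.float64)
    z_trans = np.asarray(z_trans, dtype=np.float64)
    n = z.shape[0]

    # Cosine similarity via L2-normalized features.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    z_trans = z_trans / np.linalg.norm(z_trans, axis=1, keepdims=True)

    pos = np.sum(z * z_trans, axis=1) / temperature   # (N,)
    sim = (z @ z.T) / temperature                     # (N, N)

    # Keep only same-bias pairs as negatives; never use a sample against itself.
    mask = np.asarray(same_bias, dtype=bool) & ~np.eye(n, dtype=bool)
    neg = np.where(mask, sim, -np.inf)

    # Numerically stable log-sum-exp over [positive, negatives].
    logits = np.concatenate([pos[:, None], neg], axis=1)
    m = logits.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return float(np.mean(lse - pos))
```

The loss is zero when a sample has no same-bias negatives, and it grows as same-bias samples become as similar to the anchor as its translated counterpart is.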