Mitigating Spurious Correlations via Disagreement Probability

Hyeonggeun Han, Sehwan Kim, Hyungjun Joo, Sangwoo Hong, Jungwoo Lee

TL;DR

This work tackles spurious correlations in supervised learning when bias labels are unavailable. It introduces a training objective that needs no bias labels and, derived from it, DPR: a disagreement-probability–based resampling method that upsamples bias-conflicting samples, using a deliberately biased model as a proxy for group membership. The authors provide theoretical bounds showing that DPR reduces the loss disparity between bias-aligned and bias-conflicting groups while lowering the average loss, and they report state-of-the-art performance across six benchmarks, including challenging real-world datasets. The approach depends on a two-stage training process and on calibration of the biased model, yet it delivers practical gains in robustness and generalization to unseen data with minimal bias-label requirements.

Abstract

Models trained with empirical risk minimization (ERM) are prone to being biased towards spurious correlations between target labels and bias attributes, which leads to poor performance on data groups lacking spurious correlations. It is particularly challenging to address this problem when access to bias labels is not permitted. To mitigate the effect of spurious correlations without bias labels, we first introduce a novel training objective designed to robustly enhance model performance across all data samples, irrespective of the presence of spurious correlations. From this objective, we then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which does not require bias labels. DPR leverages the disagreement between the target label and the prediction of a biased model to identify bias-conflicting samples (those without spurious correlations) and upsamples them according to the disagreement probability. Empirical evaluations on multiple benchmarks demonstrate that DPR achieves state-of-the-art performance over existing baselines that do not use bias labels. Furthermore, we provide a theoretical analysis that details how DPR reduces dependency on spurious correlations.
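
The resampling idea in the abstract can be illustrated with a minimal sketch. It assumes that the disagreement probability of a sample is one minus the probability the biased model assigns to the target label, and that sampling weights are these values normalized over the dataset; the paper's exact estimator and training schedule may differ. The function names (`disagreement_probabilities`, `sampling_weights`, `resample_indices`) and the toy data are hypothetical and chosen only for illustration.

```python
import random


def disagreement_probabilities(biased_probs, labels):
    """Probability that the biased model disagrees with each target label.

    biased_probs[i][k] is the biased model's softmax probability of class k
    for sample i; labels[i] is the target class index of sample i.
    Assumption: disagreement is measured as 1 - p_biased(y_i | x_i).
    """
    return [1.0 - probs[y] for probs, y in zip(biased_probs, labels)]


def sampling_weights(disagreement):
    """Normalize disagreement probabilities into a sampling distribution."""
    total = sum(disagreement)
    n = len(disagreement)
    if total <= 0.0:
        # Degenerate case: the biased model fits every label perfectly,
        # so fall back to uniform sampling.
        return [1.0 / n] * n
    return [d / total for d in disagreement]


def resample_indices(weights, num_samples, seed=0):
    """Draw sample indices with replacement according to the weights."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)


# Toy data: samples 0-2 are bias-aligned (the biased model is confident in
# the target label); sample 3 is bias-conflicting (the biased model assigns
# low probability to the target label).
biased_probs = [
    [0.95, 0.05],
    [0.90, 0.10],
    [0.85, 0.15],
    [0.90, 0.10],
]
labels = [0, 0, 0, 1]

disagreement = disagreement_probabilities(biased_probs, labels)
weights = sampling_weights(disagreement)
batch = resample_indices(weights, num_samples=1000)
```

In this toy setting the single bias-conflicting sample receives most of the sampling mass, so it appears in resampled batches far more often than any bias-aligned sample. In a deep-learning pipeline the same weights could be handed to a weighted sampler instead of `random.choices`.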

Paper Structure

This paper contains 51 sections, 2 theorems, 22 equations, 8 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Suppose that the loss function $\ell(f_\theta(x), y)$ is upper-bounded by a constant $C>0$. Given two distinct groups $b_a \in \mathcal{B}$ and $b_c \in \mathcal{B}$ such that $b_a \neq b_c$, the following inequality holds with probability at least $1-\delta$, for any $\delta > 0$:

Figures (8)

  • Figure 1: An illustration of the cow/camel classification task. Red dotted boxes indicate samples where spurious correlations do not hold.
  • Figure 2: Distributions of disagreement probabilities for each sample within bias-aligned and bias-conflicting groups.
  • Figure 3: Average loss of randomly initialized, pretrained, and biased models on bias-aligned and bias-conflicting groups. The error bars represent the standard deviations over three trials.
  • Figure 4: Colored MNIST.
  • Figure 5: Multi-bias MNIST.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • proof
  • proof