Sebra: Debiasing Through Self-Guided Bias Ranking

Adarsh Kappiyath, Abhra Chaudhuri, Ajay Jaiswal, Ziquan Liu, Yunpeng Li, Xiatian Zhu, Lu Yin

TL;DR

This work tackles bias-ridden subpopulation shifts by proposing Sebra, a self-guided, unsupervised spuriosity ranking method that exploits a local hardness–spuriosity symmetry in ERM to order data within each class from highly spurious to less spurious. By dynamically steering ERM with selection and upweighting signals, Sebra yields a fine-grained spuriosity ranking used to form informative contrastive pairs for debiasing. Empirical results on UrbanCars, CelebA, BAR, and ImageNet-1K show that Sebra surpasses state-of-the-art unsupervised debiasing methods and is competitive with supervised baselines, while never requiring bias annotations. The approach provides a scalable, model-agnostic pathway to mitigate multiple biases and offers avenues for extending unsupervised bias discovery and robust representation learning.

Abstract

Ranking samples by fine-grained estimates of spuriosity (the degree to which spurious cues are present) has recently been shown to significantly benefit bias mitigation, over the traditional binary biased-vs-unbiased partitioning of train sets. However, this spuriosity ranking comes with the requirement of human supervision. In this paper, we propose a debiasing framework based on our novel Self-Guided Bias Ranking (Sebra), which mitigates biases (spurious correlations) via an automatic ranking of data points by spuriosity within their respective classes. Sebra leverages a key local symmetry in Empirical Risk Minimization (ERM) training: the difficulty of learning a sample via ERM inversely correlates with its spuriosity; the fewer spurious correlations a sample exhibits, the harder it is to learn, and vice versa. However, globally across iterations, ERM tends to deviate from this symmetry. Sebra dynamically steers ERM to correct this deviation, facilitating the sequential learning of attributes in increasing order of difficulty, i.e., decreasing order of spuriosity. As a result, the sequence in which Sebra learns samples naturally provides spuriosity rankings. We use the resulting fine-grained bias characterization in a contrastive learning framework to mitigate biases from multiple sources. Extensive experiments show that Sebra consistently outperforms previous state-of-the-art unsupervised debiasing techniques across multiple standard benchmarks, including UrbanCars, BAR, CelebA, and ImageNet-1K. Code, pre-trained models, and training logs are available at https://kadarsh22.github.io/sebra_iclr25/.

Paper Structure

This paper contains 30 sections, 1 theorem, 34 equations, 11 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Iff the spuriosity measure $u_i^* = e^{-t(\mathcal{L}_\text{CE}(f_{\theta}(x_i), y_i))}$, where $t(x)$ is a monotonically increasing function of $x$, the variable $u_i$ in eqn_g_u, across all values of $\mathcal{L}_\text{CE}(f_{\theta}(x_i), y_i)$, satisfies the following conservation law: such that $u_i^*$ is the minimizer of the conserved function.
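To make the form $u_i^* = e^{-t(\mathcal{L}_\text{CE})}$ concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function names `cross_entropy` and `spuriosity_weights` are hypothetical, the logits are toy values, and the choice $t(x) = x$ is an assumption made only for illustration (the theorem requires only that $t$ be monotonically increasing). Under that assumed choice, $u_i^*$ reduces to the softmax probability the model assigns to the true class, so easily learned (low-loss) samples receive weights near 1.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Per-sample cross-entropy loss from raw logits (numerically stable)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def spuriosity_weights(losses, t=lambda x: x):
    """u_i = exp(-t(loss_i)); t must be monotonically increasing.

    The identity default for t is an assumption made for this sketch.
    """
    return np.exp(-t(losses))

# Toy logits for three samples that all belong to class 0 (illustrative values).
logits = np.array([[4.0, 0.0],   # learned confidently -> low loss
                   [1.0, 0.5],   # borderline
                   [0.0, 3.0]])  # misclassified -> high loss
labels = np.array([0, 0, 0])

losses = cross_entropy(logits, labels)
u = spuriosity_weights(losses)

# Sort from highest to lowest weight, i.e. from easiest to hardest sample.
order = np.argsort(-u)
print(order.tolist())  # [0, 1, 2]
```

Any other monotonically increasing `t`, for example a scaled loss `lambda x: 2.0 * x`, changes the magnitude of the weights but not the ordering they induce.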

Figures (11)

  • Figure 1: In each step of Self-guided bias ranking (Sebra), datapoints are upweighted with $u_{i}$ and then trained via ERM. Following this, we estimate $v_{i}$ for each sample to select them for subsequent training. Samples for which $v_{i}$ transitions from 1 to 0 are ranked at each step and eliminated from subsequent training. Any unranked samples are appended to the ranked list at the end of the training phase. In the mitigation phase, negative pairs are formed using samples with the same rank, while positive pairs are obtained using samples with a higher rank than the reference samples.
  • Figure 2: Training dynamics of Sebra and ERM monitored in terms of accuracies of background bias (left), co-occurring object bias (center), and core attribute (right).
  • Figure 3: Dataset samples. Images from the datasets with multiple spurious correlations used in our experiments. For the CelebA and UrbanCars datasets, each column depicts groups categorised by biased features, along with their proportions in the training set, and each row displays samples from a different class. The images at the bottom show samples from the 6 classes of the BAR dataset. Images with red borders belong to the BAR evaluation set; the others belong to the BAR training set.
  • Figure 4: Top 5 spurious concepts discovered using the spuriosity rankings introduced by Moayeri et al. (2023). As observed, the identified neurons capture only a subset of the features corresponding to the spurious attribute 'background'; a ranking based on the top-k most highly activating neurons would therefore capture only partial characteristics of the spurious features.
  • Figure 5: Qualitative analysis on the UrbanCars dataset: examples of top-ranked (high-spurious) and bottom-ranked (low-spurious) samples as ranked by Sebra, showcasing a range of samples from both classes.
  • ...and 6 more figures
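
The ranking bookkeeping described in the Figure 1 caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `rank_from_selection` and the `v_history` array (one row of binary selection values $v_i$ per training step) are hypothetical, and how $v_i$ is actually estimated is outside the scope of the sketch. It only shows how samples whose $v_i$ transitions from 1 to 0 are ranked and eliminated at each step, with unranked samples appended at the end.

```python
import numpy as np

def rank_from_selection(v_history):
    """Build a ranked list of sample indices from per-step selection values.

    v_history: array-like of shape (n_steps, n_samples) holding binary v_i.
    A sample is ranked at the first step where its v_i drops to 0, then
    removed from further consideration. Samples never dropped are appended
    at the end, in index order (the within-step order is an assumption).
    """
    v_history = np.asarray(v_history)
    n_steps, n_samples = v_history.shape
    active = np.ones(n_samples, dtype=bool)  # every sample starts selected
    ranked = []
    for step in range(n_steps):
        # Samples still in training whose v_i has transitioned from 1 to 0.
        dropped = np.flatnonzero(active & (v_history[step] == 0))
        ranked.extend(dropped.tolist())
        active[dropped] = False
    # Append whatever was never ranked during training.
    ranked.extend(np.flatnonzero(active).tolist())
    return ranked

# Toy selection history for 4 samples over 3 steps (illustrative values).
v_history = [[1, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 0, 1, 0]]
ranking = rank_from_selection(v_history)
print(ranking)  # [1, 0, 3, 2]
```

In this toy run, sample 1 is dropped first and so sits at the top of the list, while sample 2 is never dropped and is appended last.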

Theorems & Definitions (4)

  • Definition 1
  • Definition 2: Attribute Types and Spuriosity Ranking
  • Theorem 1: Hardness-Spuriosity Conservation
  • proof