Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation

Guangtao Zheng, Wenqian Ye, Aidong Zhang

TL;DR

This paper tackles an annotation-free setting and proposes a self-guided spurious correlation mitigation framework that automatically constructs fine-grained training labels tailored for a classifier obtained with empirical risk minimization to improve its robustness against spurious correlations.

Abstract

Deep neural classifiers tend to rely on spurious correlations between spurious attributes of inputs and targets to make predictions, which could jeopardize their generalization capability. Training classifiers robust to spurious correlations typically relies on annotations of spurious correlations in data, which are often expensive to get. In this paper, we tackle an annotation-free setting and propose a self-guided spurious correlation mitigation framework. Our framework automatically constructs fine-grained training labels tailored for a classifier obtained with empirical risk minimization to improve its robustness against spurious correlations. The fine-grained training labels are formulated with different prediction behaviors of the classifier identified in a novel spuriousness embedding space. We construct the space with automatically detected conceptual attributes and a novel spuriousness metric which measures how likely a class-attribute correlation is exploited for predictions. We demonstrate that training the classifier to distinguish different prediction behaviors reduces its reliance on spurious correlations without knowing them a priori and outperforms prior methods on five real-world datasets.
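The relabeling idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not give the formula for the spuriousness metric, so the score below is a stand-in assumption: the absolute gap in classifier accuracy between same-class samples that have an attribute and those that lack it. The function names (`spuriousness_scores`, `embed`, `kmeans`), the toy data, and the use of plain k-means over all samples are likewise hypothetical choices made for illustration.

```python
# Illustrative sketch only; the score definition and all names are assumptions.

def spuriousness_scores(samples, vocab, num_classes):
    """samples: list of (attribute_set, class_label, classifier_was_correct)."""
    scores = {}
    for c in range(num_classes):
        for a in vocab:
            hit = [ok for attrs, y, ok in samples if y == c and a in attrs]
            miss = [ok for attrs, y, ok in samples if y == c and a not in attrs]
            if hit and miss:
                # Stand-in metric: accuracy gap with vs. without the attribute.
                scores[(c, a)] = abs(sum(hit) / len(hit) - sum(miss) / len(miss))
            else:
                scores[(c, a)] = 0.0
    return scores


def embed(attrs, y, scores, vocab):
    """One coordinate per attribute: its score for class y if present, else 0."""
    return [scores[(y, a)] if a in attrs else 0.0 for a in vocab]


def kmeans(points, k, iters=20):
    """Tiny Lloyd's k-means, initialised from the first k distinct points."""
    centers = []
    for p in points:
        if list(p) not in centers:
            centers.append(list(p))
        if len(centers) == k:
            break
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [
            min(range(len(centers)),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centers[j])))
            for p in points
        ]
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


# Toy data: class 0 = landbird, class 1 = waterbird (hypothetical ERM outcomes).
vocab = ["water", "forest"]
samples = (
    [({"forest"}, 0, True)] * 3 + [({"water"}, 0, False)]
    + [({"water"}, 1, True)] * 3 + [({"forest"}, 1, False)]
)

scores = spuriousness_scores(samples, vocab, num_classes=2)
points = [embed(attrs, y, scores, vocab) for attrs, y, _ in samples]
clusters = kmeans(points, k=2)

# Fine-grained label = (original class, cluster in the embedding space).
fine_labels = [(y, cl) for (_, y, _), cl in zip(samples, clusters)]
print(sorted(set(fine_labels)))
```

In this toy setup, each class is split into a majority group (where the background attribute agrees with the class) and a minority group (where it does not), which is the kind of distinction the fine-grained labels are meant to expose. Training on these labels and balancing the training data are not covered by the sketch.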

Paper Structure (36 sections, 8 equations, 6 figures, 12 tables, 1 algorithm)

Figures (6)

  • Figure 1: Method overview. (a) Detecting attributes with a pre-trained VLM. (b) Quantifying the spuriousness of correlations between classes and detected attributes. (c) Clustering in the spuriousness embedding space for relabeling the training data. (d) Diversifying the outputs of the classifier and training the classifier with balanced training data.
  • Figure 2: (a) and (b): Spuriousness scores for the attributes detected from landbird and waterbird based on an ERM model. (d) and (e): Spuriousness scores based on our LBC model. (c) and (f): Spuriousness embeddings of the images in the Waterbirds dataset based on the ERM and LBC models, respectively.
  • Figure 3: Worst-group accuracy comparison of (a) leave-one-out study on the four proposed components and (b) analysis on the number of clusters $K$ on the Waterbirds dataset.
  • Figure 4: Examples of the generated text descriptions for images in the ImageNet-9 dataset.
  • Figure 5: Samples selected based on the two detected attributes, christmas tree and phone. Although these attributes are not self-explanatory in representing the selected samples, samples selected by them have some common characteristics.
  • ...and 1 more figure