Invariant Learning with Annotation-free Environments

Phuong Quynh Le, Christin Seifert, Jörg Schlötterer

TL;DR

The paper tackles domain generalization under spurious correlations by removing the need for annotated environments. It leverages clustering in the representation space of an ERM model to identify conflict samples that counter the training spurious correlations, building two annotation-free environments for invariant risk minimization. The proposed method achieves competitive performance on ColoredMNIST, strengthening invariant learning without restricting the reference model, and demonstrates robustness across varying spurious correlations. The work suggests a practical direction for scalable invariant learning and calls for future exploration into multi-class settings and multiple concurrent spurious features.

Abstract

Invariant learning is a promising approach to improve domain generalization compared to Empirical Risk Minimization (ERM). However, most invariant learning methods rely on the assumption that training examples are pre-partitioned into different known environments. We instead infer environments without the need for additional annotations, motivated by observations of the properties within the representation space of a trained ERM model. We show the preliminary effectiveness of our approach on the ColoredMNIST benchmark, achieving performance comparable to methods requiring explicit environment labels and on par with an annotation-free method that poses strong restrictions on the ERM reference model.
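
The environment-inference idea can be illustrated with a small sketch. It assumes that cluster assignments of the ERM embeddings are already available (for example, from k-means) and that, within each cluster, samples whose class label differs from the cluster's most common label are treated as conflict samples. That splitting rule and the names `split_environments`, `cluster_ids`, and `labels` are illustrative assumptions, not necessarily the procedure used in the paper.

```python
from collections import Counter, defaultdict


def split_environments(cluster_ids, labels):
    """Split sample indices into (majority_env, conflict_env).

    Assumption (not taken from the paper): a sample counts as a "conflict"
    sample if its class label differs from the most common class label
    in the cluster it was assigned to.
    """
    # Group sample indices by their cluster id.
    by_cluster = defaultdict(list)
    for idx, c in enumerate(cluster_ids):
        by_cluster[c].append(idx)

    majority_env, conflict_env = [], []
    for members in by_cluster.values():
        # Most common class label inside this cluster.
        majority_label = Counter(labels[i] for i in members).most_common(1)[0][0]
        for i in members:
            if labels[i] == majority_label:
                majority_env.append(i)
            else:
                conflict_env.append(i)
    return sorted(majority_env), sorted(conflict_env)


# Toy data: two clusters, each with one sample that disagrees with the majority.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [0, 0, 0, 1, 1, 1, 1, 0]
env_a, env_b = split_environments(cluster_ids, labels)
print(env_a, env_b)
```

The two index lists could then serve as the two training environments handed to an invariant learning objective; how the environments are balanced or weighted is left open here.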

Paper Structure

This paper contains 10 sections, 3 equations, 2 figures.

Figures (2)

  • Figure 1: (a) Example instances from ColoredMNIST with colors correlated with binary class labels. t-SNE (van der Maaten and Hinton, 2008) projection of the embedding space with colors representing (b) color annotations and (c) clusters obtained by k-means clustering. (d) Cluster purity w.r.t. spurious features (S-purity) and classes (C-purity).
  • Figure 2: Predictive performance (average over 10 runs) on (a) the standard test set and (b) varying test environments. In (a), ✓ indicates that the method requires annotations of the environments, whereas ✗ indicates that it does not; ✗ ✗ for the Oracle means the method is trained and evaluated on gray images with $n_y=0.25$. Models using our sampling approach are highlighted. * indicates values taken from the original paper.
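
The S-purity and C-purity in Figure 1(d) can be read as cluster purity computed against two different annotations. The excerpt does not spell out the exact formula, so the sketch below assumes the standard definition: the fraction of samples that carry the most common annotation of their cluster. The function name `cluster_purity` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, annotations):
    """Fraction of samples that carry the most common annotation of their cluster.

    Assumes the standard purity definition; the paper's exact formula
    is not given in this excerpt.
    """
    # Collect the annotations seen in each cluster.
    by_cluster = defaultdict(list)
    for c, a in zip(cluster_ids, annotations):
        by_cluster[c].append(a)
    # Count, per cluster, how many samples match the dominant annotation.
    matched = sum(Counter(vals).most_common(1)[0][1] for vals in by_cluster.values())
    return matched / len(cluster_ids)


# Toy data: clusters follow color more closely than class.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
colors = ["red", "red", "red", "red", "green", "green", "green", "red"]
classes = [0, 0, 1, 0, 1, 1, 0, 1]

s_purity = cluster_purity(cluster_ids, colors)   # purity w.r.t. the spurious feature
c_purity = cluster_purity(cluster_ids, classes)  # purity w.r.t. the class label
print(s_purity, c_purity)
```

In this toy case the clusters are purer with respect to color than with respect to class, which is the kind of gap the S-purity versus C-purity comparison is meant to expose.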