Understanding the Failure Modes of Out-of-Distribution Generalization

Vaishnavh Nagarajan, Anders Andreassen, Behnam Neyshabur

TL;DR

The paper analyzes why ERM-based training fails to generalize to out-of-distribution (OoD) domains even on easy tasks where invariant features fully determine the label. It identifies two fundamental failure modes, geometric skew and statistical skew, both arising from spurious correlations, and develops a framework based on gradient-descent-trained linear models to isolate and bound these effects. The authors derive theoretical insights and demonstrate them with MNIST- and CIFAR-based experiments on both simple models and modern architectures, showing that removing the skews eliminates OoD failures in these settings. The work argues for a multifaceted approach to OoD generalization, since no single remedy addresses all identified failure modes, and lays a foundation for future algorithmic development in this area.

Abstract

Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training time, resulting in poor accuracy at test time. In this work, we identify the fundamental factors that give rise to this behavior, by explaining why models fail this way even in easy-to-learn tasks where one would expect these models to succeed. In particular, through a theoretical study of gradient-descent-trained linear classifiers on some easy-to-learn tasks, we uncover two complementary failure modes. These modes arise from how spurious correlations induce two kinds of skews in the data: one geometric in nature, and another, statistical in nature. Finally, we construct natural modifications of image classification datasets to understand when these failure modes can arise in practice. We also design experiments to isolate the two failure modes when training modern neural networks on these datasets.
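
The setting in the abstract, a gradient-descent-trained linear classifier on an easy-to-learn task, can be mimicked with a small toy. The sketch below is an illustration only and is not the paper's construction or datasets: the two-feature data, the 90% agreement rate, the learning rate, and the names `make_data` and `train` are all assumptions made for this example. The invariant feature equals the label, so it alone determines the label, while the spurious feature agrees with the label on most training points. Gradient descent on the logistic loss still places positive weight on the spurious feature, and that weight shrinks relative to the invariant weight only slowly as training continues.

```python
import math

def make_data(n=100, p_majority=0.9):
    """Toy points (x_inv, x_sp, y) with labels y in {-1, +1} (hypothetical setup)."""
    data = []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        # The invariant feature equals the label, so it fully determines y.
        x_inv = float(y)
        # The spurious feature agrees with y on the first p_majority fraction of points.
        agrees = i < int(p_majority * n)
        x_sp = float(y) if agrees else float(-y)
        data.append((x_inv, x_sp, y))
    return data

def train(data, steps, lr=0.1):
    """Full-batch gradient descent on the mean logistic loss, starting from zero weights."""
    w_inv, w_sp = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        g_inv = g_sp = 0.0
        for x_inv, x_sp, y in data:
            margin = y * (w_inv * x_inv + w_sp * x_sp)
            # Gradient of log(1 + exp(-margin)) w.r.t. w is -y * x * sigmoid(-margin).
            s = 1.0 / (1.0 + math.exp(margin))
            g_inv -= y * x_inv * s
            g_sp -= y * x_sp * s
        w_inv -= lr * g_inv / n
        w_sp -= lr * g_sp / n
    return w_inv, w_sp

if __name__ == "__main__":
    data = make_data()
    for steps in (100, 2000):
        w_inv, w_sp = train(data, steps)
        print(steps, "steps: w_sp / w_inv =", w_sp / w_inv)
```

In this toy the max-margin direction puts zero weight on the spurious feature, so any spurious weight that remains after a finite number of steps comes from the training dynamics rather than from the limiting solution.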

Paper Structure

This paper contains 27 sections, 7 theorems, 47 equations, 74 figures.

Key Result

Theorem 1

(informal) Let $\mathbb{H}$ be the set of linear classifiers, $h(x) = \mathbf{w}_{\textup{inv}} \mathbf{x}_{\textup{inv}} + w_{\textup{sp}} x_{\textup{sp}} + b$. Then for any task satisfying all the constraints in the section on easy-to-learn tasks (sec:easy-to-learn) with $\mathcal{B}=1$, the max-margin classifier satisf

Figures (74)

  • Figure 1: Earlier work: Partially predictive invariant feature
  • Figure 2: This work: Fully predictive invariant feature
  • Figure 3: ResNet on CIFAR10 with spuriously colored lines
  • Figure 4: OoD accuracy drop despite no color-label correlation
  • Figure 6: OoD failure in the Binary-MNIST example of the framework section (sec:framework)
  • ...and 69 more figures

Theorems & Definitions (16)

  • Theorem 1
  • Theorem 2
  • Proposition 1
  • proof
  • Theorem 3
  • Remark 1
  • Remark 2
  • proof
  • Theorem 4
  • proof
  • ...and 6 more