On the Foundations of Shortcut Learning

Katherine L. Hermann, Hossein Mobahi, Thomas Fel, Michael C. Mozer

TL;DR

This work investigates why deep nonlinear networks adopt shortcut features that are only spuriously predictive, introducing a generative framework with latent features $z_s$ (shortcut) and $z_c$ (core) whose predictivity $\rho_i$ and availability (via amplification $\alpha_i$ and nesting $\eta_i$) can be independently controlled. It demonstrates, through controlled synthetic data, synthetic-image experiments, and NTK-based theory, that linear nets are largely unbiased with respect to feature availability, while nonlinear networks exhibit a robust availability bias that can even dominate prediction when the shortcut is more available. The theory identifies the exact interaction between predictivity and availability, predicting that bias increases with depth and nonlinearity and that equal predictivity leads to a bias toward the more-available feature; empirical results on naturalistic datasets corroborate that non-core features like backgrounds can unduly influence vision models. Altogether, the paper argues that shortcut learning is a fundamental consequence of deep nonlinear architectures and provides a framework to analyze and potentially mitigate such biases in real-world data.

Abstract

Deep-learning models can extract a rich assortment of features from data. Which features a model uses depends not only on predictivity (how reliably a feature indicates training-set labels) but also on availability (how easily the feature can be extracted from inputs). The literature on shortcut learning has noted examples in which models privilege one feature over another, for example texture over shape and image backgrounds over foreground objects. Here, we test hypotheses about which input properties are more available to a model, and systematically study how predictivity and availability interact to shape models' feature use. We construct a minimal, explicit generative framework for synthesizing classification datasets with two latent features that vary in predictivity and in factors we hypothesize to relate to availability, and we quantify a model's shortcut bias: its over-reliance on the shortcut (more available, less predictive) feature at the expense of the core (less available, more predictive) feature. We find that linear models are relatively unbiased, but introducing a single hidden layer with ReLU or Tanh units yields a bias. Our empirical findings are consistent with a theoretical account based on Neural Tangent Kernels. Finally, we study how models used in practice trade off predictivity and availability in naturalistic datasets, discovering availability manipulations which increase models' degree of shortcut bias. Taken together, these findings suggest that the propensity to learn shortcut features is a fundamental characteristic of deep nonlinear architectures warranting systematic study given its role in shaping how models solve tasks.
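To make the two knobs in the abstract concrete, the following is a minimal sketch of a two-latent dataset generator, not the authors' actual procedure. It assumes binary latents that agree with the label with probability $\rho_i$ (predictivity) and models availability only as a scalar amplification $\alpha_i$; the nesting factor $\eta_i$ is omitted. The function names `sample_dataset` and `predictivity` are hypothetical, and the value `rho_c=0.95` is an illustrative assumption (the ratio $\alpha_s/\alpha_c=64$ and $\rho_s=0.85$ echo the Figure 3 caption).

```python
import random

def sample_dataset(n, rho_s, rho_c, alpha_s, alpha_c, seed=0):
    """Draw n labelled points with a shortcut latent and a core latent.

    Simplifying assumption: each latent is +1 or -1, not a continuous value.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        # Predictivity: each latent matches the label with probability rho_i.
        z_s = y if rng.random() < rho_s else -y
        z_c = y if rng.random() < rho_c else -y
        # Availability (amplification only): scale each latent in the input.
        x = (alpha_s * z_s, alpha_c * z_c)
        data.append((x, (z_s, z_c), y))
    return data

def predictivity(data, index):
    """Fraction of samples whose latent at `index` matches the label."""
    return sum(z[index] == y for _, z, y in data) / len(data)

# Shortcut: more available (alpha 64 vs 1) but less predictive (0.85 vs 0.95).
data = sample_dataset(20000, rho_s=0.85, rho_c=0.95, alpha_s=64.0, alpha_c=1.0)
print("empirical rho_s:", predictivity(data, 0))
print("empirical rho_c:", predictivity(data, 1))
```

Because the two parameters are set independently, a dataset like this lets one ask whether a trained model follows the larger-magnitude shortcut coordinate even though the core coordinate is the more reliable one.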

Paper Structure

This paper contains 27 sections, 6 theorems, 97 equations, and 18 figures.

Key Result

Theorem 1

Consider the kernel function $k(\boldsymbol{x}_1, \boldsymbol{x}_2) \,\triangleq\, \left\langle \boldsymbol{x}_1, \boldsymbol{x}_2 \right\rangle$. The kernel operator associated with $k$ under the data distribution $p$ specified above has only one non-zero eigenvalue $\lambda=\| \boldsymbol{A} \ldots$

Figures (18)

  • Figure 1: Synthetic data. A: Two datasets differing in the predictivity of $z_s$. B: Schematic of the embedding procedure manipulating availability via the mapping from $\boldsymbol{z}$ to $\boldsymbol{x}$. Dashed boxes are optional.
  • Figure 2: Deep nonlinear models can prefer a less-predictive but more-available feature to a more-predictive but less-available one. The color of each cell in the heatmap indicates the mean bias of a model as a function of the availability and predictivity of the shortcut feature, $z_s$. The inset shows in faint coloring the decision surface for an optimal Bayes classifier (LDA) and a trained model. Overlaid points are a subset of training instances. The model obtains a shortcut bias of $0.53$.
  • Figure 3: A: Model depth increases shortcut bias. The color of each cell indicates the mean bias of an MLP with ReLU hidden activation functions, for various model widths and depths, trained on data with a shortcut feature that is more available ($\alpha_s/\alpha_c=64$) but less predictive ($\rho_s=0.85$) than the core feature. Model nonlinearity increases shortcut bias. B: Shortcut bias for three hidden activation functions for a deep MLP with width 128 and depth 2, trained on datasets where predictivity is matched ($\rho_s = \rho_c = 0.9$), but shortcut availability is higher ($\alpha_s/\alpha_c=32$). A shortcut bias is more pronounced when the model contains a nonlinear activation function. C: Shortcut bias for MLPs with a single hidden layer and a hidden activation function that is either linear (left) or ReLU (right), for various shortcut feature availabilities ($\alpha_s/\alpha_c$) and predictivities ($\rho_s$). See \ref{fig:appx_tanh_single_hidden} for Tanh.
  • Figure 4: ResNet-18 prefers a shortcut feature when availability is instantiated as the pixel footprint of an object (feature), even when that feature is less predictive. A: Sample images. B: Shortcut bias increases as a function of relative availability of the shortcut feature when features are equally predictive ($\rho_s = \rho_c = 0.9$), consistent with wolff2022signal. C: Even when the shortcut feature is less predictive, models have a shortcut bias due to availability, when $\alpha_s/\alpha_c = 4$.
  • Figure 5: Plot of $\mathop{\mathrm{sign}}\nolimits( (|\zeta_1| - |\zeta_2|) (a_1 - a_2) )$ for a ReLU network as a function of $a_1$ and $a_2$. We fix $\mu_1=1$ and vary $\mu_2 \in \{0.1, 0.5, 1, 2, 10\}$. Yellow and blue correspond to values $+1$ and $-1$ respectively.
  • ...and 13 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6