Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations

GuanWen Qiu, Da Kuang, Surbhi Goel

TL;DR

The paper addresses why neural networks tend to learn spurious, easier-to-learn features and how this affects learning dynamics of core, invariant features. It introduces a Boolean-function–based framework with a core feature f_c and a spurious feature f_s, controlled by complexity and confounder strength λ, and analyzes gradient dynamics of a two-layer ReLU network under SGD across parity and staircase functions. Key contributions include (i) empirical R1–R5 observations detailing slowed core learning, two-subnetwork formation, memorization of spurious features, effectiveness of Last Layer Retraining, and limitations of common debiasing methods; (ii) a theoretical Fourier-gap-based explanation for spurious-first learning, slow core learning, and persistence of spurious features; and (iii) demonstrations that the framework aligns with semi-synthetic and real datasets while offering precise controls over feature complexity. The findings clarify when debiasing strategies like LLR help and reveal limitations of popular algorithms in more general settings, providing a rigorous benchmark and guidance for designing robust training procedures against spurious correlations.

Abstract

Existing research often posits spurious features as easier to learn than core features in neural network optimization, but the impact of their relative simplicity remains under-explored. Moreover, studies mainly focus on end performance rather than the dynamics of feature learning. In this paper, we propose a theoretical framework and an associated synthetic dataset grounded in Boolean function analysis. This setup allows for fine-grained control over the relative complexity (compared to core features) and correlation strength (with respect to the label) of spurious features to study the dynamics of feature learning under spurious correlations. Our findings uncover several interesting phenomena: (1) stronger spurious correlations or simpler spurious features slow down the rate at which the core features are learned, (2) two distinct subnetworks are formed to learn core and spurious features separately, (3) learning phases of spurious and core features are not always separable, (4) spurious features are not forgotten even after core features are fully learned. We demonstrate that our findings justify the success of retraining the last layer to remove spurious correlations and also identify limitations of popular debiasing algorithms that exploit early learning of spurious features. We support our empirical findings with theoretical analyses for the case of learning XOR features with a one-hidden-layer ReLU network.
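To make the synthetic setup concrete, the following is a minimal sketch of how such a Boolean dataset could be sampled. It is not the authors' code and rests on several assumptions: both $f_c$ and $f_s$ are parities over disjoint blocks of $c$ core and $s$ spurious coordinates, $u$ further coordinates are irrelevant, and $\lambda$ is read as the probability that the spurious feature agrees with the label. The names `parity`, `sample_point`, and `agreement_rate` are hypothetical.

```python
import random

def parity(x, idx):
    """Product of the +/-1 entries of x at the positions in idx (a parity/XOR)."""
    p = 1
    for i in idx:
        p *= x[i]
    return p

def sample_point(c, s, u, lam, rng):
    """Draw one (x, y) pair with x in {-1, +1}^(c+s+u).

    Assumed layout: coordinates [0, c) are core, [c, c+s) are spurious,
    and the remaining u coordinates are irrelevant. The label is
    y = f_c(x), and f_s(x) == y holds with probability lam.
    """
    n = c + s + u
    x = [rng.choice((-1, 1)) for _ in range(n)]
    y = parity(x, range(c))                      # core feature defines the label
    want_agree = rng.random() < lam              # should f_s match the label?
    agrees = parity(x, range(c, c + s)) == y
    if want_agree != agrees:
        x[c] = -x[c]                             # one flip toggles the spurious parity
    return x, y

def agreement_rate(data, c, s):
    """Fraction of samples whose spurious parity equals the label."""
    hits = sum(parity(x, range(c, c + s)) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
data = [sample_point(10, 4, 6, 0.9, rng) for _ in range(4000)]
print(agreement_rate(data, 10, 4))               # close to lam = 0.9
```

Under this reading, setting `lam=0.5` leaves the spurious block uncorrelated with the label, and larger values of `lam` strengthen the spurious correlation.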

Paper Structure (59 sections, 20 theorems, 35 equations, 29 figures, 5 tables)

This paper contains 59 sections, 20 theorems, 35 equations, 29 figures, 5 tables.

Key Result

Lemma 1

Let $\xi_k = \widehat{\mathsf{Maj}}([k])$ be the $k$-th Fourier coefficient of the $n=c+s+u$ variable Majority function. At initialization, there is a set of neurons such that the population gradient gap on the variables compared to the irrelevant variables ... These quantities are negative because they ...
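The quantity $\xi_k$ in Lemma 1 can be checked numerically for small $n$. The sketch below assumes the standard definition of a Fourier coefficient over the uniform distribution on $\{-1,+1\}^n$, $\widehat{f}(S) = \mathbb{E}_x[f(x)\prod_{i \in S} x_i]$, and takes $[k]$ to be the first $k$ coordinates. It enumerates all $2^n$ inputs, so it is only practical for small odd $n$. The function names are hypothetical and the code is illustrative only.

```python
from itertools import product

def majority(x):
    """Majority of +/-1 entries; len(x) is assumed odd so there are no ties."""
    return 1 if sum(x) > 0 else -1

def fourier_coefficient(f, n, subset):
    """Brute-force E_x[f(x) * prod_{i in subset} x_i] over uniform x in {-1,+1}^n."""
    total = 0
    for x in product((-1, 1), repeat=n):
        chi = 1
        for i in subset:
            chi *= x[i]                  # character chi_S(x)
        total += f(x) * chi
    return total / 2 ** n

def xi(k, n):
    """Fourier coefficient of n-variable Majority on the first k coordinates."""
    return fourier_coefficient(majority, n, range(k))

# Majority on 3 bits expands as (x1 + x2 + x3)/2 - (x1*x2*x3)/2.
print([xi(k, 3) for k in (1, 2, 3)])
```

For $n = 3$ the printed values match the known expansion of Majority: $1/2$ on each single coordinate, $0$ on pairs, and $-1/2$ on the full set.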

Figures (29)

  • Figure 1: A comparison of our dataset with the domino image dataset. Here $\lambda=0.75$. We take both $f_s$ and $f_c$ to be parity functions. A dark grey square on a Boolean vector denotes $1$ and a light grey square denotes $-1$.
  • Figure 2: Core/spurious correlation and decoded correlation dynamics on different datasets. The leftmost figure shows that, for the staircase function, the Fourier coefficients of both the spurious and core functions are fitted from low (light color) to high (dark color). All experiments use $\lambda = 0.9$. Staircase: $\deg(f_s)=7, \deg(f_c)=14$; Parity: $\deg(f_s)=4, \deg(f_c)=10$; CIFAR-MNIST: (c) Truck-car (s) 01.
  • Figure 3: Influence of confounder strength and complexity of the spurious feature on the learning of core features. The $y$-axis shows the number of epochs required to reach $0.95$ core correlation. The $0$-degree bar indicates the epochs required to learn the core feature when no spurious correlation is present, i.e., $\lambda=0.5$. Each bar in the left two plots is based on 30 repetitions of the experiment.
  • Figure 4: Each plot shows the weight dynamics throughout training within a single selected neuron, and each curve corresponds to the weight dynamics on a single coordinate. The left two plots are for staircase with $\text{deg}(f_s)=7, \text{deg}(f_c)=14$ and the right two plots are for parity with $\text{deg}(f_s)=4, \text{deg}(f_c)=10$. For both experiments, $\lambda=0.9$. We see that the neurons separate into core and spurious neurons. Spurious neurons remain focused on learning the spurious feature, while core neurons eventually emerge and learn the core feature.
  • Figure 5: The two plots are produced by running parity experiments under different $\lambda$, focusing on the decoded spurious correlation. The experiments use $\text{deg}(f_s)=4, \text{deg}(f_c)=10$. Left: when $\lambda$ is low, the spurious feature is forgotten in the later stage of training. Right: when $\lambda$ is relatively high, the spurious feature is memorized once it is learned.
  • ...and 24 more figures

Theorems & Definitions (32)

  • Lemma 1: informal
  • Theorem 1: informal, barak_hidden_2023
  • Lemma 2
  • Lemma 3: informal
  • Lemma 4: informal
  • Theorem A.1
  • Lemma 5
  • proof
  • Lemma A.1
  • proof
  • ...and 22 more