Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)

Kai Gu, Weishi Shi

TL;DR

The paper investigates how continual learners can become rigid because of shortcut features, incidental cues that align with labels but lack causal meaning. It introduces the Einstellung Rigidity Index (ERI), a compact diagnostic with three facets, Adaptation Delay ($AD$), Performance Deficit ($PD$), and Relative Suboptimal Feature Reliance ($SFR_{rel}$), measured in a two-phase CIFAR-100 setup with a deterministic patch. Comparing several CL methods against a Scratch_T2 baseline and applying a masking intervention, the study shows that CL models often reach accuracy thresholds earlier (negative $AD$) but do not consistently outperform Scratch_T2 on patched data (small positive $PD$), and that masking the patch helps CL models while slightly hurting Scratch_T2 (negative $SFR_{rel}$). This is consistent with the patch acting as a distractor rather than a beneficial shortcut. ERI thus serves as a practical screening tool for distinguishing transfer from cue-driven performance and for guiding further probes or mitigations, with implications for robust evaluation and design of continual learning systems across datasets and modalities.

Abstract

Deep neural networks frequently exploit shortcut features, defined as incidental correlations between inputs and labels without causal meaning. Shortcut features undermine robustness and reduce reliability under distribution shifts. In continual learning (CL), the consequences of shortcut exploitation can persist and intensify: weights inherited from earlier tasks bias representation reuse toward whatever features most easily satisfied prior labels, mirroring the cognitive Einstellung effect, a phenomenon where past habits block optimal solutions. Whereas catastrophic forgetting erodes past skills, shortcut-induced rigidity throttles the acquisition of new ones. We introduce the Einstellung Rigidity Index (ERI), a compact diagnostic that disentangles genuine transfer from cue-inflated performance using three interpretable facets: (i) Adaptation Delay (AD), (ii) Performance Deficit (PD), and (iii) Relative Suboptimal Feature Reliance (SFR_rel). On a two-phase CIFAR-100 CL benchmark with a deliberately spurious magenta patch in Phase 2, we evaluate Naive fine-tuning (SGD), online Elastic Weight Consolidation (EWC_on), Dark Experience Replay (DER++), Gradient Projection Memory (GPM), and Deep Generative Replay (DGR). Across these continual learning methods, we observe that CL methods reach accuracy thresholds earlier than a Scratch-T2 baseline (negative AD) but achieve slightly lower final accuracy on patched shortcut classes (positive PD). Masking the patch improves accuracy for CL methods while slightly reducing Scratch-T2, yielding negative SFR_rel. This pattern indicates the patch acted as a distractor for CL models in this setting rather than a helpful shortcut.
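The three ERI facets can be illustrated with a minimal Python sketch. The abstract fixes only the sign conventions (negative AD means the CL method reaches the threshold earlier than Scratch_T2, positive PD means lower final accuracy than Scratch_T2, and negative SFR_rel arises when masking helps CL but hurts Scratch_T2). The exact formulas below are assumptions chosen to match those signs, not the authors' reference implementation. All function names and the toy numbers are hypothetical.

```python
from typing import Optional, Sequence


def epochs_to_threshold(acc_curve: Sequence[float], tau: float) -> Optional[int]:
    """First 1-indexed effective epoch whose accuracy reaches tau, or None."""
    for epoch, acc in enumerate(acc_curve, start=1):
        if acc >= tau:
            return epoch
    return None  # threshold never crossed within the epoch budget


def adaptation_delay(cl_curve: Sequence[float],
                     scratch_curve: Sequence[float],
                     tau: float = 0.6) -> Optional[int]:
    """Assumed form: AD = E_CL(tau) - E_S(tau); negative means CL is earlier."""
    e_cl = epochs_to_threshold(cl_curve, tau)
    e_s = epochs_to_threshold(scratch_curve, tau)
    if e_cl is None or e_s is None:
        return None  # AD is undefined when either model never crosses tau
    return e_cl - e_s


def performance_deficit(acc_scratch: float, acc_cl: float) -> float:
    """PD = A_S - A_CL on patched shortcut classes; positive means CL is worse."""
    return acc_scratch - acc_cl


def sfr_rel(cl_patched: float, cl_masked: float,
            s_patched: float, s_masked: float) -> float:
    """Assumed form: (patched - masked drop for CL) minus the same drop for scratch."""
    delta_cl = cl_patched - cl_masked
    delta_s = s_patched - s_masked
    return delta_cl - delta_s


# Toy accuracy curves and accuracies (illustrative only, not results from the paper).
toy_cl_curve = [0.30, 0.65, 0.70]
toy_scratch_curve = [0.10, 0.30, 0.50, 0.62]

toy_ad = adaptation_delay(toy_cl_curve, toy_scratch_curve, tau=0.6)
toy_pd = performance_deficit(acc_scratch=0.70, acc_cl=0.68)
toy_sfr = sfr_rel(cl_patched=0.68, cl_masked=0.71, s_patched=0.70, s_masked=0.69)
```

With these toy inputs the signs reproduce the qualitative pattern reported in the abstract: the CL curve crosses the threshold first, the scratch model ends slightly higher, and masking helps the CL model while slightly hurting the scratch model. Returning None for a curve that never crosses the threshold keeps such cases from being read as a numeric delay.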

Paper Structure

This paper contains 29 sections, 7 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Shortcut injection and masking protocol. Left: original CIFAR-100 images. Middle: shortcut added (magenta $4\times4$ patch in the top-left). Right: masked evaluation replaces the patch with a uniform black square. Both color and grayscale examples are shown to emphasize cue salience under photometric variation.
  • Figure 2: Sample augmentations used in both T1 and T2. Left: raw images. Right: augmented versions (random crop and horizontal flip). The shortcut patch is injected after these transforms, ensuring its location is stable in the final image.
  • Figure 3: Panel A: Shortcut accuracy vs. effective epochs on $SC$ (patched). AD annotations show large negative $\mathrm{AD}$ for all continual learners that cross $\tau{=}0.6$; DGR does not cross within 50 epochs.
  • Figure 4: AD sensitivity analysis on $SC$. Negative indicates earlier adaptation than Scratch_T2; positive indicates slower adaptation. Hatched cells denote no threshold crossing within the 50-epoch budget.
  • Figure 5: Panel B: Performance Deficit over time, $\mathrm{PD}_t = A_S - A_{CL}$ (SC, patched). Negative values indicate the CL method outperforms Scratch_T2.
  • ...and 1 more figure
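The shortcut injection and masking protocol described in the captions for Figures 1 and 2 can be sketched with NumPy as follows. The sketch assumes 32×32 RGB images stored as uint8 arrays in height × width × channel order, encodes magenta as (255, 0, 255), and uses a zero-padded random crop with a padding of 4 pixels. The captions specify none of these details, and all function names are hypothetical. The one property taken directly from the captions is the ordering: the patch is painted after augmentation, so it always sits in the top-left corner of the final image.

```python
import numpy as np

PATCH_SIZE = 4                                     # 4x4 patch, per the Figure 1 caption
MAGENTA = np.array([255, 0, 255], dtype=np.uint8)  # assumed RGB encoding of magenta
BLACK = np.array([0, 0, 0], dtype=np.uint8)        # uniform black mask


def paint_top_left(image: np.ndarray, color: np.ndarray,
                   size: int = PATCH_SIZE) -> np.ndarray:
    """Return a copy of an HxWx3 image with a size x size top-left square set to color."""
    out = image.copy()
    out[:size, :size, :] = color
    return out


def inject_shortcut(image: np.ndarray) -> np.ndarray:
    """Add the magenta shortcut patch."""
    return paint_top_left(image, MAGENTA)


def mask_shortcut(image: np.ndarray) -> np.ndarray:
    """Replace the patch region with a black square for masked evaluation."""
    return paint_top_left(image, BLACK)


def augment(image: np.ndarray, rng: np.random.Generator, pad: int = 4) -> np.ndarray:
    """Random crop (zero padding, assumed pad=4) followed by a random horizontal flip."""
    h, w, _ = image.shape
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    crop = padded[top:top + h, left:left + w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # horizontal flip
    return crop


def make_patched_sample(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Augment first, then inject, so the patch location is stable in the final image."""
    return inject_shortcut(augment(image, rng))


# Usage on a random stand-in image (not real CIFAR-100 data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
patched = make_patched_sample(image, rng)
masked = mask_shortcut(patched)
```

Injecting before the crop and flip would move the patch or mirror it to the top-right corner in some samples. Applying it last is what makes the cue deterministic, and it lets masked evaluation overwrite exactly the same pixels.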