The impact of model size on catastrophic forgetting in Online Continual Learning

Eunhae Lee

TL;DR

This work investigates how model size (depth and width) affects catastrophic forgetting in Online Continual Learning (CL), challenging the assumption that larger models inherently improve continual adaptation. Using ResNet variants on the SplitCIFAR-10 benchmark in a class-incremental setting, both online and offline, it trains with Experience Replay and analyzes quantitative metrics alongside qualitative saliency patterns. The study finds that larger models often underperform in CL, especially online, and that reduced-width variants (e.g., Slim-ResNet18) can yield strong results despite their smaller capacity. These findings have practical implications for deploying CL systems: they highlight nuanced tradeoffs between scale and CL performance and motivate further research into scalable, robust continual learning strategies.

Abstract

This study investigates the impact of model size on Online Continual Learning performance, with a focus on catastrophic forgetting. Employing ResNet architectures of varying sizes, the research examines how network depth and width affect model performance in class-incremental learning using the SplitCIFAR-10 dataset. Key findings reveal that larger models do not guarantee better Continual Learning performance; in fact, they often struggle more in adapting to new tasks, particularly in online settings. These results challenge the notion that larger models inherently mitigate catastrophic forgetting, highlighting the nuanced relationship between model size and Continual Learning efficacy. This study contributes to a deeper understanding of model scalability and its practical implications in Continual Learning scenarios.
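To make the setup concrete, the following is a minimal sketch of a class-incremental task split and an Experience Replay buffer. It is not the author's implementation: the names `split_classes` and `ReservoirBuffer` are hypothetical, the 5-task, 2-classes-per-task split is a common SplitCIFAR-10 convention assumed here, and reservoir sampling is one common buffer-update rule that the paper may or may not use.

```python
import random


def split_classes(num_classes=10, classes_per_task=2):
    """Partition class labels into disjoint, ordered tasks (class-incremental).

    Assumption: 10 classes split into 5 tasks of 2 classes each.
    """
    labels = list(range(num_classes))
    return [labels[i:i + classes_per_task]
            for i in range(0, num_classes, classes_per_task)]


class ReservoirBuffer:
    """Fixed-size replay memory updated with reservoir sampling (assumed rule)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []          # stored (input, label) examples
        self.seen = 0            # number of stream examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one stream example to the buffer."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep each seen example with equal probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples to mix into the current mini-batch."""
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)
```

In an online setting, each incoming mini-batch would be seen once, combined with a call to `sample` for rehearsal, and then offered to `add`; in an offline setting the same buffer could be used while making several passes over each task's data.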

Paper Structure (26 sections, 6 equations, 4 figures, 3 tables)

This paper contains 26 sections, 6 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Average Anytime Accuracy (AAA) of ResNets of different sizes in Online and Offline Continual Learning
  • Figure 2: Validation stream accuracy (Online CL)
  • Figure 3: Forgetting curves, Online CL (left) and Offline CL (right). Solid lines: Average Forgetting (AF); Dotted lines: Average Cumulative Forgetting (ACF)
  • Figure 4: Saliency map visualizations for Online CL
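
The metrics named in the captions of Figures 1 and 3 can be sketched as follows. The paper's own equations are not reproduced in this excerpt, so the code uses commonly cited definitions and may differ from the author's: `acc[i][j]` is assumed to hold the accuracy on task `j` after training on task `i`, evaluation is simplified to task boundaries (online AAA is usually evaluated at many points along the stream), and all function names are hypothetical. Average Cumulative Forgetting is omitted because its definition is not given here.

```python
def average_accuracy(acc, i):
    """Mean accuracy over tasks 0..i after training on task i."""
    return sum(acc[i][: i + 1]) / (i + 1)


def average_anytime_accuracy(acc):
    """Mean of average_accuracy over all evaluation points.

    Assumption: one evaluation point per task boundary.
    """
    return sum(average_accuracy(acc, i) for i in range(len(acc))) / len(acc)


def average_forgetting(acc, i):
    """Mean drop from the best earlier accuracy on each previous task.

    For task j < i, the drop is max over k in [j, i) of acc[k][j], minus acc[i][j].
    """
    if i == 0:
        return 0.0  # nothing has been learned before the first task
    drops = []
    for j in range(i):
        best_before = max(acc[k][j] for k in range(j, i))
        drops.append(best_before - acc[i][j])
    return sum(drops) / len(drops)
```

A usage example would pass a lower-triangular accuracy matrix with one row per training stage, for instance a 5x5 matrix for a 5-task stream.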