The impact of model size on catastrophic forgetting in Online Continual Learning

Eunhae Lee

TL;DR

This work investigates how model size (depth and width) affects catastrophic forgetting in Online Continual Learning (CL), challenging the assumption that larger models inherently improve continual adaptation. Using ResNet variants on the SplitCIFAR-10 benchmark in class-incremental online and offline settings, it trains with Experience Replay and analyzes both quantitative metrics and qualitative saliency patterns. The study finds that larger models often underperform in CL, especially in the online setting, and that reduced-width variants (e.g., Slim-ResNet18) can yield strong results despite their lower capacity. These findings have practical implications for deploying CL systems: they expose a nuanced tradeoff between model scale and CL performance and motivate further research into scalable, robust continual learning strategies.
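To make the Experience Replay setup more concrete, the following is a minimal sketch of a replay memory and of how replayed examples might be mixed into an online stream. It is not the paper's implementation: the reservoir-sampling update rule, the class name `ReservoirBuffer`, the helper `replay_batches`, and all parameters are assumptions chosen for illustration, since the summary above only states that Experience Replay is used.

```python
import random


class ReservoirBuffer:
    """Fixed-size replay memory filled by reservoir sampling.

    Assumption: the source only says Experience Replay is used; reservoir
    sampling is one common way to maintain the memory, not necessarily
    the one used in the paper.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []   # stored (input, label) examples
        self.seen = 0     # number of stream examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one stream example to the memory."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep each seen example with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def replay_batches(stream, buffer, replay_size):
    """Yield each incoming batch extended with replayed examples.

    In the online setting each batch is seen once, so the memory is
    updated right after the batch has been used.
    """
    for batch in stream:
        replayed = buffer.sample(replay_size)
        yield list(batch) + replayed
        for example in batch:
            buffer.add(example)
```

In a training loop, each yielded batch would be passed to a single optimizer step. The memory size and the number of replayed examples per step are hyperparameters that this summary does not report.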

Abstract

This study investigates the impact of model size on Online Continual Learning performance, with a focus on catastrophic forgetting. Employing ResNet architectures of varying sizes, the research examines how network depth and width affect model performance in class-incremental learning using the SplitCIFAR-10 dataset. Key findings reveal that larger models do not guarantee better Continual Learning performance; in fact, they often struggle more to adapt to new tasks, particularly in online settings. These results challenge the notion that larger models inherently mitigate catastrophic forgetting, highlighting the nuanced relationship between model size and Continual Learning efficacy. This study contributes to a deeper understanding of model scalability and its practical implications in Continual Learning scenarios.
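As an illustration of how catastrophic forgetting is usually quantified, the sketch below computes Average Accuracy (AA) and Average Forgetting (AF) from a matrix of per-task accuracies. It follows the commonly used definitions (final accuracy averaged over tasks, and the drop from each earlier task's best past accuracy to its final accuracy). The paper's exact formulas are in its Metrics section, which is not reproduced here, so the function names and the matrix layout are assumptions.

```python
def average_accuracy(acc):
    """Mean accuracy over all tasks after training on the last task.

    acc[i][j] is the accuracy on task j measured after training
    through task i (hypothetical layout chosen for this sketch).
    """
    final = acc[-1]
    return sum(final) / len(final)


def average_forgetting(acc):
    """Mean drop from each earlier task's best past accuracy to its final accuracy."""
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    drops = []
    for j in range(num_tasks - 1):
        # Best accuracy on task j at any point before the final evaluation.
        best = max(acc[i][j] for i in range(j, num_tasks - 1))
        drops.append(best - acc[-1][j])
    return sum(drops) / len(drops)


# Toy accuracy matrix for three tasks (made-up numbers, not results from the paper).
toy = [
    [0.9, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.5, 0.7, 0.9],
]
aa = average_accuracy(toy)
af = average_forgetting(toy)
```

A model with high AA and low AF retains earlier classes while learning new ones. The anytime and cumulative variants (AAA, ACF) named in the outline track these quantities along the stream rather than only at its end.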

Paper Structure (26 sections, 6 equations, 4 figures, 3 tables)

Introduction
Related Work
Online Continual Learning
Continual Learning techniques
Model size and performance
Method
Problem definition
Task-agnostic and boundary-agnostic
Experience Replay (ER)
Benchmark
Metrics
Average Anytime Accuracy (AAA) [caccia_new_2022]
Average Cumulative Forgetting (ACF) [soutif-cormerais_comprehensive_2023; soutifcormerais2021importance]
Average Accuracy (AA) and Average Forgetting (AF) [mai_online_2021]
Model selection
...and 11 more sections

Figures (4)

Figure 1: Average Anytime Accuracy (AAA) of different sized ResNets in Online and Offline Continual Learning
Figure 2: Validation stream accuracy (Online CL)
Figure 3: Forgetting curves, Online CL (left) and Offline CL (right). Solid lines: Average Forgetting (AF); Dotted lines: Average Cumulative Forgetting (ACF)
Figure 4: Saliency map visualizations for Online CL
