The impact of model size on catastrophic forgetting in Online Continual Learning
Eunhae Lee
TL;DR
This work investigates how model size (depth and width) affects catastrophic forgetting in Online Continual Learning, challenging the assumption that larger models inherently improve continual adaptation. Using ResNet variants on the SplitCIFAR-10 benchmark in a class-incremental scenario, under both online and offline settings, it trains models with Experience Replay and analyzes both quantitative metrics and qualitative saliency patterns. The study finds that larger models often underperform in continual learning (CL), especially online, and that width-aware variants (e.g., Slim-ResNet18) can yield strong results despite reduced capacity. These findings have practical implications for deploying CL systems: they highlight nuanced tradeoffs between model scale and CL performance and motivate further research into scalable and robust continual learning strategies.
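As a rough illustration of the setup named above, the sketch below splits ten class labels into class-incremental tasks and keeps a fixed-size replay memory. Several details are assumptions made for illustration and are not taken from the paper: the two-classes-per-task split, the reservoir-sampling insertion policy, and the names `split_classes` and `ReservoirBuffer`.

```python
import random


def split_classes(num_classes=10, classes_per_task=2):
    """Partition class labels into consecutive tasks (assumed 5 tasks of 2 classes)."""
    return [
        list(range(start, start + classes_per_task))
        for start in range(0, num_classes, classes_per_task)
    ]


class ReservoirBuffer:
    """Fixed-size replay memory filled by reservoir sampling (an assumed policy)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []          # stored examples, e.g. (input, label) pairs
        self.seen = 0            # number of examples offered to the buffer so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one streamed example; each seen example is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = example

    def sample(self, k):
        """Draw up to k stored examples to mix into the current training batch."""
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


tasks = split_classes()
buffer = ReservoirBuffer(capacity=5)
for label in range(100):         # stand-in for a single pass over a data stream
    buffer.add(("x", label % 10))
replayed = buffer.sample(3)
```

In an online setting each example is seen once, so the buffer is updated as the stream arrives and `sample` supplies old examples alongside each new mini-batch.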
Abstract
This study investigates the impact of model size on Online Continual Learning performance, with a focus on catastrophic forgetting. Employing ResNet architectures of varying sizes, the research examines how network depth and width affect model performance in class-incremental learning using the SplitCIFAR-10 dataset. Key findings reveal that larger models do not guarantee better Continual Learning performance; in fact, they often struggle more to adapt to new tasks, particularly in online settings. These results challenge the notion that larger models inherently mitigate catastrophic forgetting, highlighting the nuanced relationship between model size and Continual Learning efficacy. This study contributes to a deeper understanding of model scalability and its practical implications in Continual Learning scenarios.
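Catastrophic forgetting is commonly quantified from a matrix of per-task accuracies recorded after each training stage. The sketch below uses one widely used definition (best earlier accuracy on a task minus its final accuracy, averaged over all but the last task). It is an assumption that this matches the metric used in the study, and the numbers in `acc` are made-up placeholders, not results from the paper.

```python
def average_accuracy(acc):
    """Mean accuracy over all tasks after training on the final task."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(acc):
    """Average drop from the best earlier accuracy to the final accuracy.

    acc[i][j] is the accuracy on task j measured after training on task i.
    The last task is excluded because it cannot have been forgotten yet.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0
    drops = []
    for j in range(num_tasks - 1):
        best_earlier = max(acc[i][j] for i in range(j, num_tasks - 1))
        drops.append(best_earlier - acc[-1][j])
    return sum(drops) / len(drops)


# Hypothetical accuracy matrix for three tasks (illustrative values only).
acc = [
    [0.9, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.5, 0.6, 0.7],
]
final_acc = average_accuracy(acc)     # approximately 0.6
forgetting = average_forgetting(acc)  # approximately 0.3
```

Comparing these two numbers across model sizes is one way to separate a model that learns new tasks poorly (low final accuracy, low forgetting) from one that learns them but loses earlier ones (high forgetting).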
