A continual learning survey: Defying forgetting in classification tasks

Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory Slabaugh, Tinne Tuytelaars

TL;DR

This survey addresses catastrophic forgetting in sequential task learning by offering a taxonomy of replay, regularization-based, and parameter isolation methods, and by introducing a continual hyperparameter framework that balances stability and plasticity using only current-task data. It provides a thorough empirical comparison of 11 methods across three realistic datasets, revealing how model capacity, regularization choices, and task order influence performance. Key findings show that parameter isolation methods like PackNet excel in balanced task sequences but can hit capacity limits, while regularization-based approaches offer robustness when hyperparameters are tuned via the proposed framework; unbalanced real-world tasks challenge several methods, emphasizing the value of isolation and careful capacity management. The results yield practical guidance for method selection and highlight open challenges in extending continual learning beyond simplistic task-incremental, multi-head setups to more general, online, and privacy-conscious scenarios.

Abstract

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and attempts to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions are 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny Imagenet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.

Paper Structure

This paper contains 44 sections, 7 equations, 10 figures, 20 tables, 1 algorithm.

Figures (10)

  • Figure 1: A tree diagram illustrating the different continual learning families of methods and the different branches within each family. The leaves list example methods.
  • Figure 2: Parameter isolation and regularization-based methods (top) and replay methods (bottom) on Tiny Imagenet for the base model with random ordering, reporting average accuracy (forgetting) in the legend.
  • Figure 3: RecogSeq dataset sequence results, reporting average accuracy (forgetting).
  • Figure 4: Accuracy plots of continual learning methods for 3 different orderings of the iNaturalist dataset.
  • Figure 5: Main setup of related machine learning fields, illustrating the differences with general continual learning settings.
  • ...and 5 more figures
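
Several of the captions above report "average accuracy (forgetting)". As a rough illustration of how such numbers can be derived, the sketch below assumes a common convention that is not spelled out on this page: `acc[i][j]` holds the accuracy on task `j` measured after training on task `i`, average accuracy is the mean over all tasks after the final task, and forgetting for a task is the drop from its accuracy right after it was learned to its accuracy at the end of the sequence. The function names and the example numbers are hypothetical.

```python
from typing import List


def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks after training on the last task.

    acc[i][j] is the accuracy on task j after training on task i (j <= i),
    so acc[-1] holds the final accuracy of every task in the sequence.
    """
    final = acc[-1]
    return sum(final) / len(final)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop in accuracy between learning a task and the end of the sequence.

    Assumption: the average runs over all tasks, so the last task
    contributes zero forgetting by construction.
    """
    n = len(acc)
    drops = [acc[j][j] - acc[n - 1][j] for j in range(n)]
    return sum(drops) / n


# Hypothetical accuracies (in %) for a sequence of three tasks.
acc = [
    [80.0],              # after task 0
    [70.0, 60.0],        # after task 1
    [65.0, 55.0, 50.0],  # after task 2
]

avg_acc = average_accuracy(acc)
avg_forg = average_forgetting(acc)
print(f"{avg_acc:.2f} ({avg_forg:.2f})")
```

With this convention a method that never loses accuracy on earlier tasks reports zero forgetting, regardless of how high or low its average accuracy is, which is why the two numbers are reported together.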