Realistic Continual Learning Approach using Pre-trained Models

Nadia Nasri, Carlos Gutiérrez-Álvarez, Sergio Lafuente-Arroyo, Saturnino Maldonado-Bascón, Roberto J. López-Sastre

TL;DR

RealCL reframes continual learning as learning from random, non-stationary task streams with no access to past data, formalizing a general objective over all seen classes after each task. The paper proposes CLARE, a simple, model-agnostic approach that freezes a pre-trained encoder (e.g., CLIP visual branch) and couples it with a Dynamic Neural Adaptation Network (Dyn-NAN) and a memory module to absorb new classes while preserving prior knowledge; the Dyn-NAN adapts its output heads to the growing class set $\mathcal{Y}_k$, with predictions given by $\hat{y}_k(x)=\arg\max_{y\in\mathcal{Y}_k} f_k(x)[y]$. RealCL subsumes traditional CL settings (class-incremental, domain-incremental, task-incremental) as special cases, enabling evaluation across highly structured to fully random task streams; experiments on CIFAR-10, CIFAR-100, and TinyImagenet show CLARE achieves competitive Last Task Accuracy while minimizing forgetting, especially as memory size grows. The findings highlight the value of pre-trained representations for robust continual learning in unpredictable environments and position RealCL as a more realistic benchmark for future CL research. Overall, CLARE demonstrates that a frozen PTM backbone plus a small dynamic head and a memory buffer can effectively integrate new knowledge without extensive retraining, making it practical for real-world continual learning scenarios.
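
The prediction rule $\hat{y}_k(x)=\arg\max_{y\in\mathcal{Y}_k} f_k(x)[y]$ can be illustrated with a minimal sketch. It is not the Dyn-NAN described in the paper: the class name `GrowingHead`, the purely linear scoring, the weight initialization, and the use of plain feature lists in place of a frozen encoder's output are all assumptions made for illustration. The sketch only shows how an output set can grow with the seen classes while existing outputs stay untouched.

```python
import random


class GrowingHead:
    """Hypothetical linear head whose outputs grow with the seen class set Y_k.

    Illustrative stand-in only; the real Dyn-NAN architecture is not given here.
    """

    def __init__(self, feature_dim, seed=0):
        self.feature_dim = feature_dim
        self.weights = {}  # class label -> weight vector (one output per seen class)
        self._rng = random.Random(seed)

    def expand(self, labels):
        """Add an output for each label not seen before; keep existing outputs."""
        for label in labels:
            if label not in self.weights:
                self.weights[label] = [
                    self._rng.uniform(-0.01, 0.01) for _ in range(self.feature_dim)
                ]

    def scores(self, features):
        """Return f_k(x)[y] for every y in the current class set Y_k."""
        return {
            label: sum(w * x for w, x in zip(vec, features))
            for label, vec in self.weights.items()
        }

    def predict(self, features):
        """Return the argmax over Y_k of f_k(x)[y]."""
        class_scores = self.scores(features)
        return max(class_scores, key=class_scores.get)
```

In this sketch, `features` stands for the output of the frozen pre-trained encoder for one input, and `expand` would be called at the start of each task with that task's labels. Training the head and replaying samples from the memory module are deliberately left out.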

Abstract

Continual learning (CL) is crucial for evaluating how well learning solutions adapt while retaining knowledge. Our research addresses the challenge of catastrophic forgetting, where models lose proficiency in previously learned tasks as they acquire new ones. While numerous solutions have been proposed, existing experimental setups often rely on idealized class-incremental learning scenarios. We introduce Realistic Continual Learning (RealCL), a novel CL paradigm where class distributions across tasks are random, departing from structured setups. We also present CLARE (Continual Learning Approach with pRE-trained models for RealCL scenarios), a pre-trained model-based solution designed to integrate new knowledge while preserving past learning. Our contributions include pioneering RealCL as a generalization of traditional CL setups, proposing CLARE as an adaptable approach for RealCL tasks, and conducting extensive experiments demonstrating its effectiveness across various RealCL scenarios. Notably, CLARE outperforms existing models on RealCL benchmarks, highlighting its versatility and robustness in unpredictable learning environments.

Paper Structure (16 sections, 3 equations, 11 figures, 5 tables, 1 algorithm)

Figures (11)

  • Figure 1: In this figure, we illustrate the possible experimental evaluation scenarios that can be used to validate continual learning solutions. The model referred to as Unrealistic CL is the most commonly used in the literature, also known as the class-incremental setup. In this work, we propose to explore the RealCL model as a generalization of all continual learning setups. In RealCL, the categories to be recognized by CL models are not distributed in a structured manner across tasks, but rather randomly. Halfway between Unrealistic CL and RealCL, we have the semi-realistic scenario, where classes may still be organized across tasks, but the number of classes assigned to each task is not controlled. In this work, we provide an experimental evaluation in the new RealCL scenario of both our novel CLARE proposal and the state-of-the-art CL models.
  • Figure 2: We show the main components of CLARE: the memory module, a frozen pre-trained model and the Dyn-NAN module.
  • Figure 3: Results of CLARE for CIFAR-10 in the three scenarios with memory module size of 1K.
  • Figure 4: Results of CLARE for CIFAR-100 in the three scenarios with memory module size of 2K.
  • Figure 5: Results of CLARE for CIFAR-100 in the three scenarios with memory module size of 4K.
  • ...and 6 more figures
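
The three evaluation scenarios described in the Figure 1 caption can be sketched as task-stream generators. This is a hedged illustration, not the paper's sampling procedure: the function names, the cut-point scheme for the semi-realistic case, and the assumption that a class may reappear in several RealCL tasks are all hypothetical choices made here.

```python
import random


def class_incremental(classes, num_tasks):
    """Unrealistic CL: disjoint tasks with the same number of classes each."""
    if len(classes) % num_tasks != 0:
        raise ValueError("classes must split evenly across tasks")
    per_task = len(classes) // num_tasks
    return [classes[i * per_task:(i + 1) * per_task] for i in range(num_tasks)]


def semi_realistic(classes, num_tasks, rng):
    """Semi-realistic: classes stay organized, but task sizes are uncontrolled."""
    # Draw num_tasks - 1 distinct cut points so every task gets at least one class.
    cuts = sorted(rng.sample(range(1, len(classes)), num_tasks - 1))
    bounds = [0] + cuts + [len(classes)]
    return [classes[a:b] for a, b in zip(bounds, bounds[1:])]


def realcl(classes, num_tasks, rng):
    """RealCL (assumed form): each task draws its own random subset of classes.

    Assumption: tasks are sampled independently, so a class may recur in
    several tasks and some classes may never appear.
    """
    return [
        rng.sample(classes, rng.randint(1, len(classes)))
        for _ in range(num_tasks)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    labels = list(range(10))  # e.g. the ten CIFAR-10 class indices
    print(class_incremental(labels, 5))
    print(semi_realistic(labels, 4, rng))
    print(realcl(labels, 4, rng))
```

Passing an explicit `random.Random` instance keeps each stream reproducible, which matters when comparing methods on the same random task order.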