Class-Incremental Learning: A Survey

Da-Wei Zhou; Qi-Wei Wang; Zhi-Hong Qi; Han-Jia Ye; De-Chuan Zhan; Ziwei Liu

TL;DR

This survey addresses catastrophic forgetting in Class-Incremental Learning by organizing methods into seven taxonomic categories, formalizing problem statements, and benchmarking 17 approaches on CIFAR100 and ImageNet under memory-budget–aligned protocols. It reveals strong baselines in exemplar replay and knowledge distillation, while dynamic networks shine when memory budgets are generous, and pre-trained models offer substantial gains with caveats about fairness. A memory-agnostic AUC framework is proposed to compare methods across budgets, uncovering regime-specific strengths and guiding practical deployment. The paper also outlines future directions, including exemplar-free settings, online learning, multi-modal inputs, and forward-looking compatibility with pre-trained foundation models.
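One way to read the memory-agnostic AUC idea is as the area under an accuracy-versus-memory-budget curve, so that a method is summarized across budgets rather than at a single one. The sketch below assumes a trapezoidal rule normalized by the budget range; the function name `memory_auc`, the normalization, and all numbers are illustrative assumptions, not the survey's exact definition or results.

```python
from typing import Sequence


def memory_auc(budgets_mb: Sequence[float], accuracies: Sequence[float]) -> float:
    """Normalized area under an accuracy-vs-memory curve (trapezoidal rule).

    Assumption: normalizing by the budget range, so the result is the
    average accuracy over the covered budgets.
    """
    if len(budgets_mb) != len(accuracies) or len(budgets_mb) < 2:
        raise ValueError("need at least two (budget, accuracy) pairs of equal length")
    points = sorted(zip(budgets_mb, accuracies))
    span = points[-1][0] - points[0][0]
    if span <= 0:
        raise ValueError("budgets must span a non-zero range")
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)  # trapezoid between adjacent budgets
    return area / span


# Placeholder numbers for illustration only; they are NOT results from the survey.
budgets = [8.0, 16.0, 32.0, 64.0]      # hypothetical memory budgets in MB
method_a = [60.0, 64.0, 66.0, 67.0]    # hypothetical: strong at small budgets
method_b = [50.0, 62.0, 70.0, 74.0]    # hypothetical: strong at large budgets
auc_a = memory_auc(budgets, method_a)
auc_b = memory_auc(budgets, method_b)
```

With these placeholder curves, the first method wins at the smallest budget while the second has the larger area, which is the kind of regime-specific difference a single-budget comparison would hide.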

Abstract

Deep models, e.g., CNNs and Vision Transformers, have achieved impressive results in many vision tasks in the closed world. However, novel classes emerge from time to time in our ever-changing world, requiring a learning system to acquire new knowledge continually. Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally and build a universal classifier among all seen classes. Unfortunately, when the model is trained directly on new class instances, a fatal problem occurs: the model tends to catastrophically forget the characteristics of former classes, and its performance drastically degrades. There have been numerous efforts to tackle catastrophic forgetting in the machine learning community. In this paper, we comprehensively survey recent advances in class-incremental learning and summarize these methods from several aspects. We also provide a rigorous and unified evaluation of 17 methods on benchmark image classification tasks to characterize the different algorithms empirically. Furthermore, we notice that the current comparison protocol ignores the influence of memory budget in model storage, which may result in unfair comparisons and biased results. Hence, we advocate fair comparison by aligning the memory budget in evaluation, and we propose several memory-agnostic performance measures. The source code is available at https://github.com/zhoudw-zdw/CIL_Survey/
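The class-incremental protocol described in the abstract can be sketched in a few lines: classes are partitioned into non-overlapping tasks, and after each stage the model is evaluated over every class seen so far, with no task identity given. The helper names (`split_into_tasks`, `seen_classes`, `predict_cil`) and the logit values are hypothetical and serve only to illustrate the setting.

```python
from typing import List, Sequence


def split_into_tasks(num_classes: int, classes_per_task: int) -> List[List[int]]:
    """Partition class ids 0..num_classes-1 into ordered, non-overlapping tasks."""
    if classes_per_task <= 0:
        raise ValueError("classes_per_task must be positive")
    return [
        list(range(start, min(start + classes_per_task, num_classes)))
        for start in range(0, num_classes, classes_per_task)
    ]


def seen_classes(tasks: List[List[int]], stage: int) -> List[int]:
    """Label space used for evaluation after learning tasks[0..stage]."""
    return [c for task in tasks[: stage + 1] for c in task]


def predict_cil(logits: Sequence[float], tasks: List[List[int]], stage: int) -> int:
    """CIL inference: argmax over all classes seen so far (no task id available)."""
    return max(seen_classes(tasks, stage), key=lambda c: logits[c])


tasks = split_into_tasks(num_classes=10, classes_per_task=5)

# Hypothetical logits for one test sample whose true class is 1 (an old class).
logits = [0.1, 2.0, 0.3, 0.0, 0.2, 0.5, 2.5, 0.1, 0.0, 0.4]
pred_stage0 = predict_cil(logits, tasks, stage=0)  # only classes 0..4 compete
pred_stage1 = predict_cil(logits, tasks, stage=1)  # classes 0..9 compete
```

In this toy example the sample is classified correctly while only the first task's classes compete, but a new class wins once the label space grows. A model biased toward recently learned classes would show this pattern.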

Paper Structure (46 sections, 24 equations, 30 figures, 29 tables)

Introduction
Preliminaries
Problem Formulation
Exemplars and Exemplar Set
Class-Incremental Learning: Taxonomy
Data Replay
Data Regularization
Dynamic Networks
Neuron Expansion
Backbone Expansion
Prompt Expansion
Parameter Regularization
Knowledge Distillation
Model Rectify
Template-Based Classification
...and 31 more sections

Figures (30)

Figure 1: The setting of CIL. Non-overlapping classes arrive sequentially, and the model needs to learn to classify all the classes incrementally. After learning each task, the model is evaluated among all seen classes. An ideal model should perform well on the newly learned classes while retaining its performance on the earlier ones.
Figure 2: The settings of Class-Incremental Learning (CIL), Task-Incremental Learning (TIL), and Domain-Incremental Learning (DIL). CIL and TIL share the same training protocol, while TIL is much easier during inference, i.e., it only requires classifying within the corresponding task's label space. DIL refers to a data stream with distribution change, where new tasks contain the same classes from different domains, e.g., cartoon and clip-art. The distinction among these scenarios was proposed by van2022three.
Figure 3: The roadmap of class-incremental learning. We organize representative methods chronologically to show the concentration at different stages. Different colors of these methods denote the sub-categories in Table \ref{table:taxonomy}. Knowledge distillation and data replay dominated the research before 2021, while model rectify and dynamic networks became popular after 2021.
Figure 4: Illustration of network structure evolving in backbone expansion. Left: DER expands a new backbone per incremental task. Middle: FOSTER adds an extra model compression stage, which maintains limited model storage. Right: MEMO decouples the network structure and only expands specialized blocks.
Figure 5: Illustration of knowledge distillation in CIL. Left: Logit distillation aligns the model outputs to make the old and new models share the same semantic relationship. Middle: Feature distillation aligns the features produced by the old and new models to ensure the new model does not forget old features. Right: Relational distillation resorts to structural inputs, e.g., triples, and aligns the input relationships of the old and new models.
...and 25 more figures
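
The logit and feature distillation variants named in the Figure 5 caption can be sketched as two small loss functions. This is a minimal illustration under stated assumptions, not the formulation of any specific surveyed method: the temperature value, the KL-divergence form, the squared Euclidean feature distance, and the function names are all choices made here for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation_loss(
    old_logits: Sequence[float],
    new_logits: Sequence[float],
    temperature: float = 2.0,  # assumed value, chosen for illustration
) -> float:
    """KL(old || new) computed over the old classes only.

    new_logits may contain extra entries for new classes; only the first
    len(old_logits) entries are compared with the old model's outputs.
    """
    num_old = len(old_logits)
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[:num_old], temperature)
    eps = 1e-12  # guards against log(0) after underflow
    return sum(
        p * (math.log(max(p, eps)) - math.log(max(q, eps)))
        for p, q in zip(p_old, p_new)
    )


def feature_distillation_loss(old_feat: Sequence[float], new_feat: Sequence[float]) -> float:
    """Squared Euclidean distance between old-model and new-model features."""
    if len(old_feat) != len(new_feat):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(old_feat, new_feat))


# Hypothetical outputs for one sample: 3 old classes, then 2 new classes appended.
old_out = [2.0, 0.5, -1.0]
new_out_same = [2.0, 0.5, -1.0, 3.0, 0.0]      # old-class logits preserved
new_out_drifted = [0.0, 0.5, 1.5, 3.0, 0.0]    # old-class logits changed
loss_same = logit_distillation_loss(old_out, new_out_same)
loss_drifted = logit_distillation_loss(old_out, new_out_drifted)
feat_loss = feature_distillation_loss([1.0, 2.0], [1.0, 4.0])
```

In training, a term like this would typically be added to the classification loss on new classes, so the new model is penalized when its outputs or features on old classes drift away from those of the frozen old model.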

Theorems & Definitions (2)

Definition 1
Definition 2
