Diffusion Model Meets Non-Exemplar Class-Incremental Learning and Beyond

Jichuan Zhang, Yali Li, Xin Liu, Shengjin Wang

TL;DR

This work tackles non-exemplar class-incremental learning by addressing the representation shift and memory constraints that arise when old exemplars are unavailable. It introduces DiffFR, a three-component approach that combines similarity-driven self-supervision for a generalizable feature extractor, a compact diffusion model to generate class-conditioned old-feature replay, and prototype calibration to focus diffusion on the shape of feature distributions rather than exact prototypes, while keeping the feature extractor frozen. Empirical results on CIFAR-100, TinyImageNet, and ImageNet-Subset show DiffFR achieving state-of-the-art average incremental accuracy gains (e.g., improvements around $2$–$4$ percentage points) and reduced forgetting, with additional gains under enhanced data augmentation and in domain-incremental settings. The method is memory-efficient and exemplar-free, and demonstrates strong generalization to new classes and domain shifts, signaling practical impact for continual learning in privacy-conscious or data-limited scenarios, with code to be released.

Abstract

Non-exemplar class-incremental learning (NECIL) aims to resist catastrophic forgetting without saving old class samples. Prior methods generally employ simple rules to generate features for replay, and suffer from a large distribution gap between replayed features and real ones. To address this issue, we propose a simple yet effective Diffusion-based Feature Replay (DiffFR) method for NECIL. First, to alleviate the limited representational capacity caused by fixing the feature extractor, we employ Siamese-based self-supervised learning to obtain generalizable initial features. Second, we devise diffusion models to generate class-representative features highly similar to real features, which provides an effective way for exemplar-free knowledge memorization. Third, we introduce prototype calibration to direct the diffusion model's focus towards learning the distribution shapes of features, rather than the entire distribution. Extensive experiments on public datasets demonstrate significant performance gains of our DiffFR, which outperforms state-of-the-art NECIL methods by 3.0% on average. The code will be made publicly available soon.

Paper Structure (25 sections, 10 equations, 8 figures, 9 tables)

This paper contains 25 sections, 10 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Comparison of three feature replay approaches. a) and b) represent methods exemplified by PASS and FeTrIL, respectively. c) represents our method (DiffFR). Compared to a) and b), we replay features with high similarity to real features, thus improving the performance of NECIL. Besides, our approach does not suffer from representation shift or the inability to generalize.
  • Figure 2: Illustration of DiffFR for NECIL. Classes of the initial task are augmented by rotation transformation, which results in a difference between the classifier in the initial training and that in the incremental training.
  • Figure 3: Illustration of prototype calibration. To learn the distribution of real features, we first normalize them by class. Subsequently, we use the diffusion model to learn the distribution of normalized features. Finally, denormalization by class is applied to samples to generate features with accurate prototypes.
  • Figure 4: Evolution of top-1 accuracy on three datasets with T = 10 phases.
  • Figure 5: (a) Average accuracy of total/old/new classes at each phase. For better visualization, we use the setting of CIFAR-100, $T=5$ as an example. (b) Visualization of real and generated features on CIFAR-100 (10 phases). Each class has 500 real features, and we generate 5000 features to facilitate more accurate comparison between the distribution shape of real features and generated features.
  • ...and 3 more figures
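
The prototype calibration steps named in the Figure 3 caption (normalize by class, learn the normalized distribution, denormalize by class) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that "normalize by class" means subtracting the per-class mean (the prototype) and dividing by the per-class standard deviation in each dimension, which the caption does not specify. The diffusion model is replaced by a hypothetical `placeholder_generator` that simply draws standard Gaussian samples, and the names `class_stats`, `normalize`, and `denormalize` are likewise illustrative.

```python
import random
from statistics import fmean, pstdev

EPS = 1e-8  # guards against division by zero for constant dimensions


def class_stats(features):
    """Per-dimension mean (the class prototype) and standard deviation."""
    dims = list(zip(*features))  # transpose: one tuple per feature dimension
    mean = [fmean(d) for d in dims]
    std = [pstdev(d) + EPS for d in dims]
    return mean, std


def normalize(features, mean, std):
    """Map one class's features to zero mean and unit scale (assumed form)."""
    return [[(x - m) / s for x, m, s in zip(f, mean, std)] for f in features]


def denormalize(samples, mean, std):
    """Map normalized samples back around the stored class prototype."""
    return [[x * s + m for x, m, s in zip(f, mean, std)] for f in samples]


def placeholder_generator(n, dim, rng):
    """Stand-in for the diffusion model: draws standard Gaussian samples.

    This is NOT the paper's generator; it only marks where sampling happens.
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Toy "real" features for one old class, centered near (5, -3).
real = [[rng.gauss(5.0, 2.0), rng.gauss(-3.0, 0.5)] for _ in range(500)]

mean, std = class_stats(real)            # statistics kept per class
normalized = normalize(real, mean, std)  # what a generator would be trained on
# Replayed features: generated in normalized space, then shifted and scaled
# back so that they are centered on the stored prototype.
replayed = denormalize(placeholder_generator(5000, 2, rng), mean, std)
```

Under these assumptions, the generator only has to model the shape of the normalized distribution, while the stored per-class statistics restore the prototype location at replay time.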