Diffusion Model Meets Non-Exemplar Class-Incremental Learning and Beyond
Jichuan Zhang, Yali Li, Xin Liu, Shengjin Wang
TL;DR
This work tackles non-exemplar class-incremental learning by addressing the representation shift and memory constraints that arise when old exemplars are unavailable. It introduces DiffFR, a three-component approach that keeps the feature extractor frozen. The components are similarity-driven self-supervision to obtain a generalizable feature extractor, a compact diffusion model that generates class-conditioned old features for replay, and prototype calibration that focuses the diffusion model on the shape of feature distributions rather than on exact prototypes. Empirical results on CIFAR-100, TinyImageNet, and ImageNet-Subset show DiffFR achieving state-of-the-art average incremental accuracy (e.g., improvements of around $2$–$4$ percentage points) and reduced forgetting, with additional gains under enhanced data augmentation and in domain-incremental settings. The method is memory-efficient and exemplar-free, and it generalizes well to new classes and domain shifts, which makes it relevant to continual learning in privacy-conscious or data-limited scenarios. The code is to be released.
Abstract
Non-exemplar class-incremental learning (NECIL) aims to resist catastrophic forgetting without storing samples from old classes. Prior methods generally use simple rules to generate features for replay, which leaves a large distribution gap between replayed features and real ones. To address this issue, we propose a simple yet effective \textbf{Diff}usion-based \textbf{F}eature \textbf{R}eplay (\textbf{DiffFR}) method for NECIL. First, to alleviate the limited representational capacity caused by fixing the feature extractor, we employ Siamese-based self-supervised learning to obtain generalizable initial features. Second, we devise diffusion models that generate class-representative features highly similar to real features, which provides an effective means of exemplar-free knowledge memorization. Third, we introduce prototype calibration to direct the diffusion model's focus towards learning the distribution shapes of features, rather than the entire distribution. Extensive experiments on public datasets demonstrate significant performance gains for DiffFR, which outperforms state-of-the-art NECIL methods by 3.0\% on average. The code will be made publicly available soon.
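
One possible reading of the prototype-calibration idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that calibration means subtracting the class prototype (the mean feature) before fitting the generator and adding it back at replay time. The diffusion model is swapped out for a per-dimension Gaussian stand-in so the example stays self-contained, and all names (`class_prototype`, `calibrate`, `GaussianShapeModel`, `replay`) are hypothetical.

```python
import random
from statistics import mean, pstdev


def class_prototype(features):
    """Per-dimension mean of one class's feature vectors (assumed prototype)."""
    return [mean(dim) for dim in zip(*features)]


def calibrate(features, prototype):
    """Subtract the prototype so only the shape of the distribution remains."""
    return [[x - p for x, p in zip(f, prototype)] for f in features]


class GaussianShapeModel:
    """Stand-in for the diffusion model (NOT a diffusion model).

    It only fits a per-dimension standard deviation of the centered features.
    """

    def fit(self, centered):
        self.std = [pstdev(dim) for dim in zip(*centered)]
        return self

    def sample(self, n, rng):
        # Draw zero-mean residuals that mimic the learned shape.
        return [[rng.gauss(0.0, s) for s in self.std] for _ in range(n)]


def replay(model, prototype, n, rng):
    """Generate replay features: sampled residual plus the stored prototype."""
    return [[r + p for r, p in zip(res, prototype)]
            for res in model.sample(n, rng)]


# Toy "old class" features from a frozen extractor (synthetic data).
rng = random.Random(0)
features = [[rng.gauss(5.0, 1.0), rng.gauss(-2.0, 0.5)] for _ in range(200)]

proto = class_prototype(features)
centered = calibrate(features, proto)
model = GaussianShapeModel().fit(centered)
replayed = replay(model, proto, 1000, rng)
```

In this sketch only the prototype and the small shape model need to be kept for an old class, so no raw samples are stored. A real diffusion model would replace `GaussianShapeModel` to capture non-Gaussian structure in the residuals.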
