DESIRE: Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning
Haiyang Guo, Fei Zhu, Fanhu Zeng, Bing Liu, Xu-Yao Zhang
TL;DR
DESIRE tackles rehearsal-free class-incremental learning by decoupling per-task training and applying two efficient post-processing modules: dynamic representation consolidation, which continually merges two LoRA parameter sets using a feature-space attribution loss, and decision boundary refinement, which rebalances the classifier with pseudo-features. The approach uses Gaussian class statistics (means $\boldsymbol{\mu}_i$ and covariances $\boldsymbol{\Sigma}_i$) to guide merging and calibrate the classifier without storing the parameters of every past task. Empirically, DESIRE achieves state-of-the-art performance among rehearsal-free methods and results competitive with rehearsal-based methods on CIFAR100, TinyImageNet, and ImageNet380 in 5-, 10-, and 20-task settings, while remaining efficient. The work contributes a scalable, plug-in merging paradigm and distribution-based classifier calibration that improve the stability-plasticity balance under continual learning with minimal data leakage.
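To make the idea of merging two LoRA parameter sets concrete, the following is a minimal sketch, not the authors' implementation. It assumes each LoRA pair contributes a dense update `B @ A` and that the two updates are combined with a single fixed coefficient `lam`; the function names and `lam` are hypothetical. In DESIRE the merge is guided by a feature-space loss rather than fixed by hand.

```python
import numpy as np

def lora_delta(A, B):
    """Dense weight update contributed by one LoRA pair (B @ A)."""
    return B @ A

def merge_lora(delta_old, delta_new, lam):
    """Convex combination of two weight updates.

    lam is a hypothetical merge coefficient in [0, 1]; it is fixed here
    for illustration only.
    """
    return lam * delta_old + (1.0 - lam) * delta_new

rng = np.random.default_rng(0)
d, r = 8, 2  # feature dimension and LoRA rank (illustrative values)
A_old, B_old = rng.normal(size=(r, d)), rng.normal(size=(d, r))
A_new, B_new = rng.normal(size=(r, d)), rng.normal(size=(d, r))

delta_old = lora_delta(A_old, B_old)
delta_new = lora_delta(A_new, B_new)
merged = merge_lora(delta_old, delta_new, lam=0.5)
```

Only the old and new updates are needed at merge time, which matches the claim that two LoRA parameter sets are retained rather than one per task.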
Abstract
Continual learning aims to equip models with the human-like ability to retain previously learned knowledge. Recent work incorporating Parameter-Efficient Fine-Tuning has revitalized the field by introducing lightweight extension modules. However, existing methods usually overlook information leakage: the data used in their experiments have already been seen by the pre-trained model. Once these duplicate data are removed from the pre-training phase, the performance of these methods can degrade severely. In this paper, we propose a new LoRA-based rehearsal-free method named DESIRE. Rather than imposing additional constraints during training to mitigate catastrophic forgetting, our method trains without them, thereby maximizing the learning of new classes. To integrate knowledge from old and new tasks, we propose two efficient post-processing modules. First, we retain only two sets of LoRA parameters for merging and propose dynamic representation consolidation to calibrate the merged feature representation. Second, we propose decision boundary refinement to address the classifier bias that arises from training solely on new-class data. Extensive experiments demonstrate that our method achieves state-of-the-art performance on multiple datasets and strikes an effective balance between stability and plasticity. Our code will be publicly available.
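The classifier bias mentioned above can be illustrated with a small sketch of pseudo-feature rebalancing. This is an assumption-laden example, not the paper's code: it supposes that each class is summarized by a Gaussian (a mean vector and a covariance matrix) and that an equal number of pseudo-features is drawn per class so that old and new classes are equally represented when the classifier is recalibrated. The function name `sample_pseudo_features` and the toy statistics are hypothetical.

```python
import numpy as np

def sample_pseudo_features(means, covs, n_per_class, seed=0):
    """Draw the same number of pseudo-features from each class Gaussian."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for c, (mu, sigma) in enumerate(zip(means, covs)):
        # One Gaussian per class, parameterized by its stored statistics.
        feats.append(rng.multivariate_normal(mu, sigma, size=n_per_class))
        labels.append(np.full(n_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

# Toy statistics for three classes in a 4-dimensional feature space.
means = [np.zeros(4), np.full(4, 3.0), np.full(4, -3.0)]
covs = [np.eye(4) * 0.1 for _ in means]

X, y = sample_pseudo_features(means, covs, n_per_class=32)
```

The balanced pair `(X, y)` could then be used to retrain or fine-tune the classification head without access to stored samples from earlier tasks.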
