Exemplar-free Continual Representation Learning via Learnable Drift Compensation
Alex Gomez-Villa, Dipam Goswami, Kai Wang, Andrew D. Bagdanov, Bartlomiej Twardowski, Joost van de Weijer
TL;DR
This work tackles exemplar-free continual representation learning from scratch, where class prototypes drift as the backbone is updated, causing forgetting. It shows that much of the forgetting stems from prototype drift rather than from a loss of discriminative power, and introduces Learnable Drift Compensation (LDC), a forward projector $p_F^{t}$ that maps old features to the new feature space. Old prototypes are updated via $P_t^c = p_F^{t}(P_{t-1}^c)$ without needing old data or labels. LDC is modular and compatible with supervised and self-supervised CL, enabling the first exemplar-free semi-supervised continual learning approach, and it achieves state-of-the-art results across CIFAR-100, Tiny-ImageNet, ImageNet100, and Stanford Cars with ViT variants. Empirical results, ablations, and comparisons against drift-correction baselines demonstrate that LDC effectively tracks prototype positions under moving backbones, offering a memory-efficient, plug-and-play solution for continual learning tasks.
Abstract
Exemplar-free class-incremental learning using a backbone trained from scratch and starting from a small first task presents a significant challenge for continual representation learning. Prototype-based approaches, when continually updated, face the critical issue of semantic drift, in which old class prototypes move to different positions in the new feature space. Through an analysis of prototype-based continual learning, we show that forgetting is not due to diminished discriminative power of the feature extractor, and can potentially be corrected by drift compensation. To address this, we propose Learnable Drift Compensation (LDC), which can effectively mitigate drift in any moving backbone, whether supervised or unsupervised. LDC is fast and straightforward to integrate on top of existing continual learning approaches. Furthermore, we showcase how LDC can be applied in combination with self-supervised CL methods, resulting in the first exemplar-free semi-supervised continual learning approach. We achieve state-of-the-art performance in both supervised and semi-supervised settings across multiple datasets. Code is available at https://github.com/alviur/ldc.
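The prototype update $P_t^c = p_F^{t}(P_{t-1}^c)$ can be illustrated with a minimal sketch. It is not the released implementation: it assumes the forward projector is a plain linear map fitted by least squares, that current-task samples are passed through both the old and the new backbone to obtain paired features, and that drift in the toy data is exactly linear. All function and variable names are hypothetical.

```python
import numpy as np


def fit_forward_projector(old_feats, new_feats):
    """Fit a linear map W such that old_feats @ W approximates new_feats.

    Assumption: a linear projector solved in closed form by least squares.
    Only current-task features are used; no old-class data or labels.
    """
    W, *_ = np.linalg.lstsq(old_feats, new_feats, rcond=None)
    return W


def compensate_prototypes(old_prototypes, W):
    """Move old class prototypes into the new feature space."""
    return old_prototypes @ W


def nearest_prototype(feats, prototypes):
    """Classify each feature by its closest prototype (squared Euclidean)."""
    dists = ((feats[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


# Toy data: drift is simulated as a fixed linear map (an assumption for the demo).
rng = np.random.default_rng(0)
dim = 8
drift = rng.normal(size=(dim, dim))

old_feats = rng.normal(size=(200, dim))  # current-task data, old backbone
new_feats = old_feats @ drift            # same data, new backbone

old_prototypes = rng.normal(size=(5, dim))  # prototypes of previous classes

W = fit_forward_projector(old_feats, new_feats)
new_prototypes = compensate_prototypes(old_prototypes, W)

# Old-class samples as seen by the new backbone; labels follow prototype index.
old_class_feats_new_space = old_prototypes @ drift
predictions = nearest_prototype(old_class_feats_new_space, new_prototypes)
```

In this toy setting the compensated prototypes coincide with the drifted class means, so nearest-prototype classification in the new space recovers the old classes. With a real backbone the drift is not exactly linear and the projector would only approximate it.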
