Elastic Feature Consolidation for Cold Start Exemplar-Free Incremental Learning
Simone Magistri, Tomaso Trinci, Albin Soutif-Cormerais, Joost van de Weijer, Andrew D. Bagdanov
TL;DR
Elastic Feature Consolidation (EFC) addresses Cold Start Exemplar-Free Class Incremental Learning by introducing a non-isotropic, second-order regularization in feature space through the Empirical Feature Matrix (EFM, $E_t$), which concentrates regularization on the directions that most affect previous tasks. It couples this with an asymmetric Prototype Rehearsal loss (PR-ACE) and drift-aware prototype updates to preserve backbone plasticity while mitigating forgetting, using Gaussian prototypes to balance learning across seen classes. The key contributions are the analytic formulation of the EFM, its use as a feature-space pseudo-metric for regularization, and the integration of EFM-guided prototype drift compensation within an asymmetric rehearsal framework. Empirical results on CIFAR-100, Tiny-ImageNet, and ImageNet-Subset show that EFC outperforms state-of-the-art methods in both Warm Start and especially Cold Start scenarios, achieving stronger plasticity with competitive or reduced storage costs. Because it stores no exemplars, the approach offers a privacy-preserving route to continual learning when data from earlier tasks cannot be retained.
Abstract
Exemplar-Free Class Incremental Learning (EFCIL) aims to learn from a sequence of tasks without having access to previous task data. In this paper, we consider the challenging Cold Start scenario in which insufficient data is available in the first task to learn a high-quality backbone. This is especially challenging for EFCIL since it requires high plasticity, which results in feature drift that is difficult to compensate for in the exemplar-free setting. To address this problem, we propose a simple and effective approach that consolidates feature representations by regularizing drift in directions highly relevant to previous tasks and employs prototypes to reduce task-recency bias. Our method, called Elastic Feature Consolidation (EFC), exploits a tractable second-order approximation of feature drift based on an Empirical Feature Matrix (EFM). The EFM induces a pseudo-metric in feature space which we use to regularize feature drift in important directions and to update Gaussian prototypes used in a novel asymmetric cross entropy loss which effectively balances prototype rehearsal with data from new tasks. Experimental results on CIFAR-100, Tiny-ImageNet, ImageNet-Subset and ImageNet-1K demonstrate that Elastic Feature Consolidation is better able to learn new tasks by maintaining model plasticity and significantly outperforms the state-of-the-art.
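
To make the idea of a feature-space pseudo-metric concrete, the following is a minimal sketch of a quadratic drift penalty, and not the authors' implementation. The abstract does not state the form of the EFM; the sketch assumes a Fisher-like matrix taken with respect to the features of a linear softmax classifier, averaged over a set of feature vectors. The names `feature_matrix`, `empirical_feature_matrix` and `drift_penalty` are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def feature_matrix(W, f):
    # ASSUMPTION: Fisher-like matrix w.r.t. the features of a linear softmax
    # head, W^T (diag(p) - p p^T) W, written as the covariance of the class
    # weight vectors under p. W has one row of length d per class.
    logits = [sum(w * x for w, x in zip(row, f)) for row in W]
    p = softmax(logits)
    C, d = len(W), len(f)
    w_bar = [sum(p[c] * W[c][j] for c in range(C)) for j in range(d)]
    return [[sum(p[c] * (W[c][i] - w_bar[i]) * (W[c][j] - w_bar[j])
                 for c in range(C))
             for j in range(d)]
            for i in range(d)]

def empirical_feature_matrix(W, features):
    # ASSUMPTION: the "empirical" matrix is the mean over sample features.
    d = len(features[0])
    E = [[0.0] * d for _ in range(d)]
    for f in features:
        E_f = feature_matrix(W, f)
        for i in range(d):
            for j in range(d):
                E[i][j] += E_f[i][j] / len(features)
    return E

def drift_penalty(E, f_old, f_new):
    # Quadratic form (f_new - f_old)^T E (f_new - f_old).
    delta = [b - a for a, b in zip(f_old, f_new)]
    d = len(delta)
    return sum(delta[i] * E[i][j] * delta[j]
               for i in range(d) for j in range(d))

# Toy head with two classes that ignores the second feature dimension.
W = [[1.0, 0.0], [-1.0, 0.0]]
E = empirical_feature_matrix(W, [[0.0, 0.0]])
penalty_relevant = drift_penalty(E, [0.0, 0.0], [1.0, 0.0])
penalty_irrelevant = drift_penalty(E, [0.0, 0.0], [0.0, 2.0])
```

In this toy case, drift along the first dimension is penalized while drift along the second, which the classifier ignores, incurs no penalty. A nonzero drift with zero cost is what makes the induced distance a pseudo-metric rather than a metric, and it is how such a regularizer can leave the backbone free to change in directions that do not matter to earlier tasks.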
