Model Inversion with Layer-Specific Modeling and Alignment for Data-Free Continual Learning
Ruilin Tong, Haodong Lu, Yuhang Liu, Dong Gong
TL;DR
The paper tackles data-free continual learning by addressing two key bottlenecks: distribution drift in model inversion and high computational cost on large models. It proposes Per-layer Model Inversion (PMI) to solve the inversion problem layer-by-layer, coupled with Gaussian class-wise feature modeling and a lightweight contrastive model to maintain real-synthetic feature alignment. A semantic-aware feature projection strategy enables generating pseudo-image data for new classes from CLIP’s pre-trained knowledge, allowing incremental learning without accessing prior data. Empirical results across ResNet and CLIP backbones on multiple CL benchmarks show consistent performance gains over existing data-free baselines, improved efficiency, and broad compatibility with various CL strategies. The approach mitigates feature shift, scales to large foundation models, and offers a practical path for data-free continual learning in privacy-sensitive settings.
Abstract
Continual learning (CL) aims to incrementally train a model on a sequence of tasks while retaining performance on prior ones. However, storing and replaying data is often infeasible due to privacy or security constraints and impractical for arbitrary pre-trained models. Data-free CL seeks to update models without access to previous data. Beyond regularization, we employ model inversion to synthesize data from the trained model, enabling replay without storing samples. Yet, model inversion in predictive models faces two challenges: (1) generating inputs solely from compressed output labels causes drift between synthetic and real data, and replaying such data can erode prior knowledge; (2) inversion is computationally expensive since each step backpropagates through the full model. These issues are amplified in large pre-trained models such as CLIP. To improve efficiency, we propose Per-layer Model Inversion (PMI), inspired by the faster convergence of single-layer optimization. PMI provides a strong initialization for full-model inversion, substantially reducing the number of iterations required. To mitigate feature shift, we model class-wise features via Gaussian distributions and a contrastive model, ensuring alignment between synthetic and real features. Combining PMI and feature modeling, our approach enables continual learning of new classes by generating pseudo-images from semantic-aware projected features, achieving strong effectiveness and compatibility across multiple CL settings.
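To make the class-wise Gaussian feature modeling concrete, the following is a minimal sketch, not the authors' implementation. It assumes a diagonal Gaussian per class fitted to toy 2-D feature vectors; the abstract does not specify the covariance structure or the layer at which features are modeled, and the contrastive model is not covered here. The function names `fit_class_gaussians` and `sample_class_features` are hypothetical.

```python
import random
from statistics import fmean, pstdev


def fit_class_gaussians(features, labels):
    """Estimate a per-class mean and per-dimension std (diagonal Gaussian).

    Assumption: a diagonal covariance is used purely for illustration.
    """
    by_class = {}
    for vec, label in zip(features, labels):
        by_class.setdefault(label, []).append(vec)

    stats = {}
    for label, vecs in by_class.items():
        dims = list(zip(*vecs))  # transpose: one tuple per feature dimension
        mean = [fmean(d) for d in dims]
        std = [pstdev(d) for d in dims]
        stats[label] = (mean, std)
    return stats


def sample_class_features(stats, label, n, rng):
    """Draw n synthetic feature vectors for one class from its fitted Gaussian."""
    mean, std = stats[label]
    return [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(n)]


# Toy "real" features for two old classes (stand-ins for backbone features).
rng = random.Random(0)
features = (
    [[rng.gauss(0.0, 1.0), rng.gauss(5.0, 0.5)] for _ in range(200)]
    + [[rng.gauss(3.0, 1.0), rng.gauss(-2.0, 0.5)] for _ in range(200)]
)
labels = [0] * 200 + [1] * 200

# Only the per-class statistics are kept; the raw features can be discarded.
stats = fit_class_gaussians(features, labels)

# Later, features for class 1 are sampled from the stored statistics.
replay = sample_class_features(stats, label=1, n=5, rng=rng)
```

In a data-free setting, the point of such a model is that only the per-class statistics need to be retained. Samples drawn from them could then serve as feature-level targets that keep synthesized inputs close to the real feature distribution.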
