Frugal Incremental Generative Modeling using Variational Autoencoders
Victor Enescu, Hichem Sahbi
TL;DR
The paper tackles catastrophic forgetting in continual learning by proposing a replay-free incremental approach based on a conditional Variational Autoencoder. It introduces multi-Gaussian latent priors learned via a fixed-point iteration and enforces orthogonality through null-space gradient projections to prevent forgetting while keeping memory usage frugal. The method leverages CLIP-based prompting to adapt a classifier using synthetic data generated by the decoder, achieving competitive or state-of-the-art results on several incremental-learning benchmarks with dramatically reduced memory costs. A dynamic architecture further mitigates a potential dimensionality bottleneck by selectively expanding only a small portion of parameters per task, enabling scalable continual learning. Overall, the approach combines probabilistic generative modeling, explicit task conditioning, and gradient projection to deliver memory-efficient, high-accuracy continual learning.
Abstract
Continual or incremental learning holds tremendous potential in deep learning, but it faces several challenges, including catastrophic forgetting. The advent of powerful foundation and generative models has propelled this paradigm even further, making it one of the most viable solutions for training these models. However, a persistent issue is the increasing volume of data, particularly with replay-based methods. This growth raises scalability challenges, since continuously expanding data becomes increasingly demanding as the number of tasks grows. In this paper, we attenuate this issue by devising a novel replay-free incremental learning model based on Variational Autoencoders (VAEs). The main contributions of this work include (i) a novel incremental generative model, built upon a well-designed multi-modal latent space, and (ii) an orthogonality criterion that mitigates catastrophic forgetting of the learned VAEs. The proposed method considers two variants of these VAEs: static, with no growth in the number of parameters, and dynamic, with at most a controlled growth. Extensive experiments show that our method is at least an order of magnitude more "memory-frugal" than closely related works while achieving state-of-the-art (SOTA) accuracy.
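As an illustration of the orthogonality idea, the sketch below shows a generic null-space gradient projection for a single linear layer in NumPy. It is not the authors' implementation: the function names (`null_space_projector`, `project_gradient`), the tolerance, and the toy dimensions are all assumptions made for this example. The point is only that a gradient projected onto the null space of previous-task features leaves the layer's outputs on those features unchanged.

```python
import numpy as np


def null_space_projector(features, tol=1e-10):
    """Projector onto the null space of `features` (n_samples x dim).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Uncentered covariance of the previous-task features (dim x dim).
    cov = features.T @ features
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Directions with (numerically) zero eigenvalue span the null space.
    null_mask = eigvals <= tol * max(eigvals.max(), 1.0)
    basis = eigvecs[:, null_mask]
    return basis @ basis.T


def project_gradient(grad, projector):
    """Project a weight gradient (dim x out) onto the null space."""
    return projector @ grad


# Toy setup (assumed): old-task features of rank 3 living in 8 dimensions.
rng = np.random.default_rng(0)
old_features = rng.normal(size=(20, 3)) @ rng.normal(size=(3, 8))

P = null_space_projector(old_features)
grad = rng.normal(size=(8, 4))          # gradient for a layer y = x @ W
safe_grad = project_gradient(grad, P)

# The projected update does not change outputs on old-task features.
drift = np.abs(old_features @ safe_grad).max()
print(f"max output drift on old features: {drift:.2e}")
```

When the old-task features span the full input space, the null space is empty and the projected gradient vanishes. This is one way to picture the dimensionality bottleneck that a dynamic architecture is meant to relieve.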
