Energy-Based Models for Continual Learning
Shuang Li, Yilun Du, Gido M. van de Ven, Igor Mordatch
TL;DR
This work recasts classification in continual learning as energy-based modeling: the distribution p(y|x) is defined, up to normalization, by an energy E(x,y), and a simple contrastive divergence objective contrasts the energy of the ground-truth class with that of a negative class. The resulting updates are selective, which mitigates forgetting without relying on replay or explicit task boundaries, and the approach extends naturally to boundary-free data streams. Empirical results on boundary-aware and boundary-free benchmarks (split MNIST, permuted MNIST, CIFAR-10, CIFAR-100) show that EBMs outperform softmax baselines and many continual learning methods, with further gains when combined with replay. Overall, EBMs provide a flexible, scalable building block for continual learning that can adapt to diverse task structures and data distributions, with room for further integration and architectural enhancements.
Abstract
We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs change the underlying training objective to cause less interference with previously learned information. Our proposed version of EBMs for continual learning is simple, efficient, and outperforms baseline methods by a large margin on several benchmarks. Moreover, our proposed contrastive divergence-based training objective can be combined with other continual learning methods, resulting in substantial boosts in their performance. We further show that EBMs are adaptable to a more general continual learning setting where the data distribution changes without the notion of explicitly delineated tasks. These observations point towards EBMs as a useful building block for future continual learning methods.
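To make the training objective concrete, the following is a minimal sketch of a contrastive objective of the kind described above, in plain Python. It assumes the model has already produced one scalar energy E(x, y) per class for a single input, and that the negative class is sampled uniformly from a set of candidate classes (for example, the classes present in the current batch). The function names, the `candidate_negatives` parameter, and the sampling strategy are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def class_probabilities(energies):
    """Turn per-class energies into p(y|x), proportional to exp(-E(x, y))."""
    lowest = min(energies)  # shift energies so every exponent is <= 0
    weights = [math.exp(-(e - lowest)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def contrastive_loss(energies, true_class, candidate_negatives, rng=random):
    """Energy of the ground-truth class minus energy of one sampled negative.

    Assumption: a single negative class is drawn uniformly from
    `candidate_negatives`, excluding the ground-truth class.
    Returns the loss and the set of classes the loss depends on.
    """
    negatives = [c for c in candidate_negatives if c != true_class]
    if not negatives:
        raise ValueError("need at least one class other than the ground truth")
    negative_class = rng.choice(negatives)
    # Minimizing this gap lowers E(x, y_true) and raises E(x, y_negative).
    loss = energies[true_class] - energies[negative_class]
    return loss, {true_class, negative_class}


# Hypothetical energies for a 4-class problem; lower energy means more likely.
energies = [2.0, 0.5, 3.0, 1.0]
probs = class_probabilities(energies)
loss, touched = contrastive_loss(energies, true_class=1, candidate_negatives=[0, 1])
```

In this sketch the loss depends only on the energies of the two classes in `touched`, so the energies of classes 2 and 3 receive no direct gradient from this example. A softmax cross-entropy loss, by contrast, normalizes over all classes and therefore involves every class output at each step, which is one way to read the abstract's claim that changing the objective causes less interference with previously learned information.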
