Continual learning via probabilistic exchangeable sequence modelling
Hanwen Xing, Christopher Yau
TL;DR
CL-BRUNO tackles continual learning by casting streams of task data as exchangeable sequences and modelling them with a probabilistic Neural Process built on a conditional Real-NVP bijection $f_\theta$. It combines a C-BRUNO module for modelling $p(\boldsymbol{X}_t \mid t, \boldsymbol{Y}_t)$ with a likelihood-based update regime and generative replay that yields both distributional and functional regularisation, removing the need to store past samples. The framework supports both task- and class-incremental learning, offering tractable uncertainty quantification for label and task identity and enabling outlier detection. Empirical results on CIFAR-100 and MNIST (plus biomedical datasets in the supplement) show competitive or superior performance to state-of-the-art exemplar-free and generative CL methods, with favourable memory and computation characteristics, making it practical for privacy-conscious, real-world deployment.
Abstract
Continual learning (CL) refers to the ability to continuously learn and accumulate new knowledge while retaining useful information from past experiences. Although numerous CL methods have been proposed in recent years, it is not straightforward to deploy them directly in real-world decision-making problems because of their computational cost and lack of uncertainty quantification. To address these issues, we propose CL-BRUNO, a probabilistic, Neural Process-based CL model that performs scalable and tractable Bayesian updating and prediction. Our approach uses deep generative models to create a unified probabilistic framework capable of handling different types of CL problems, such as task- and class-incremental learning, allowing users to integrate information across different CL scenarios using a single model. It prevents catastrophic forgetting through distributional and functional regularisation without retaining any previously seen samples, making it appealing for applications where data privacy or storage capacity is a concern. Experiments show that CL-BRUNO outperforms existing methods on both natural image and biomedical data sets, confirming its effectiveness in real-world applications.
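To make the idea of a conditional Real-NVP bijection more concrete, the following is a minimal NumPy sketch of a single affine coupling layer whose scale and shift depend on a conditioning vector, together with the change-of-variables log-likelihood it makes tractable. It is an illustration under stated assumptions, not the CL-BRUNO implementation: the class name `ConditionalAffineCoupling`, the one-hidden-layer network, the random untrained weights, the conditioning vector `cond` (for example, a one-hot encoding of task and label), and the standard normal base density are all hypothetical simplifications. In particular, the exchangeable latent process used by the model is replaced here by an independent Gaussian.

```python
import numpy as np


class ConditionalAffineCoupling:
    """One Real-NVP-style affine coupling layer conditioned on a context vector.

    Hypothetical illustration: weights are random and untrained.
    """

    def __init__(self, dim, cond_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d = dim // 2  # the first d coordinates pass through unchanged
        in_dim = self.d + cond_dim
        out_dim = dim - self.d
        self.W1 = rng.normal(scale=0.1, size=(in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.Ws = rng.normal(scale=0.1, size=(hidden, out_dim))
        self.Wt = rng.normal(scale=0.1, size=(hidden, out_dim))

    def _scale_shift(self, x_a, cond):
        # Small network mapping (untouched half, condition) -> (log-scale, shift).
        h = np.tanh(np.concatenate([x_a, cond], axis=-1) @ self.W1 + self.b1)
        log_s = np.tanh(h @ self.Ws)  # bounded log-scale for numerical stability
        t = h @ self.Wt
        return log_s, t

    def forward(self, x, cond):
        # x -> z, returning the log-determinant of the Jacobian.
        x_a, x_b = x[..., : self.d], x[..., self.d :]
        log_s, t = self._scale_shift(x_a, cond)
        z_b = x_b * np.exp(log_s) + t
        z = np.concatenate([x_a, z_b], axis=-1)
        return z, log_s.sum(axis=-1)

    def inverse(self, z, cond):
        # z -> x; exact because the first half is unchanged by forward().
        z_a, z_b = z[..., : self.d], z[..., self.d :]
        log_s, t = self._scale_shift(z_a, cond)
        x_b = (z_b - t) * np.exp(-log_s)
        return np.concatenate([z_a, x_b], axis=-1)


def log_likelihood(layer, x, cond):
    # Change of variables: log p(x | cond) = log N(z; 0, I) + log|det dz/dx|.
    # Assumption: a standard normal base density stands in for the latent prior.
    z, log_det = layer.forward(x, cond)
    log_base = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * z.shape[-1] * np.log(2 * np.pi)
    return log_base + log_det
```

Because the layer is invertible in closed form, the same parameters serve both for evaluating the likelihood of observed data and for generating samples by mapping latent draws through `inverse`, which is the property that generative replay relies on.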
