Mathematics of Continual Learning
Liangzu Peng, René Vidal
TL;DR
This paper establishes a principled bridge between continual learning and adaptive filtering by mapping four classic adaptive-filtering algorithms onto continual-learning scenarios: least mean squares (LMS), the affine projection algorithm (APA), recursive least squares (RLS), and the Kalman filter (KF). It shows that LMS can be viewed as an online, memoryless learner with exponential convergence on linear tasks, while APA incorporates past data via constraint projections. RLS and KF generalize these ideas to weighted past data and state-space task relationships, with Rauch-Tung-Striebel (RTS) smoothing enabling positive backward transfer. The authors further connect these methods to ideal continual learning (ICL) and gradient-projection approaches, and extend the insights to layer-wise and linearized nonlinear models, providing a cohesive mathematical foundation for understanding forgetting, task relationships, and continual adaptation. Overall, the work offers a rigorous basis for designing continual-learning algorithms, including memory-based, regularization-based, and expansion-based strategies, and points to future directions in nonlinear extensions and in analogies with kernel adaptive filtering and subspace tracking.
Abstract
Continual learning is an emerging subject in machine learning that aims to solve multiple tasks presented sequentially to the learner without forgetting previously learned tasks. Recently, many deep-learning-based approaches have been proposed for continual learning; however, the mathematical foundations behind existing continual learning methods remain underdeveloped. On the other hand, adaptive filtering is a classic subject in signal processing with a rich history of mathematically principled methods. However, its role in understanding the foundations of continual learning has been underappreciated. In this tutorial, we review the basic principles behind both continual learning and adaptive filtering, and present a comparative analysis that highlights multiple connections between them. These connections allow us to enhance the mathematical foundations of continual learning based on existing results for adaptive filtering, extend adaptive filtering insights using existing continual learning methods, and discuss a few research directions for continual learning suggested by the historical developments in adaptive filtering.
