Function-space Parameterization of Neural Networks for Sequential Learning
Aidan Scannell, Riccardo Mereu, Paul Chang, Ella Tamir, Joni Pajarinen, Arno Solin
TL;DR
This work introduces the Sparse Function-space Representation (SFR), which converts a trained neural network (NN) into a dual-parameterized Gaussian process (GP) that operates in function space, addressing two sequential learning challenges: forgetting, and integrating new data without retraining. SFR linearizes the NN around its MAP weights and uses a dual GP parameterization with inducing points, which gives scalable uncertainty quantification while preserving information from all data. It enables continual learning through a function-space regularizer and supports fast incorporation of new data via dual updates, with demonstrated benefits in supervised learning, out-of-distribution detection, and model-based RL. SFR thus combines the scalability of neural networks with GP-style uncertainty, offering a practical approach to large-scale sequential learning and decision-making. It applies to image inputs and data sets with millions of points, where traditional GPs struggle, and it provides a principled mechanism for uncertainty-guided exploration and continual learning.
Abstract
Sequential learning paradigms pose challenges for gradient-based deep learning due to difficulties incorporating new data and retaining prior knowledge. While Gaussian processes elegantly tackle these problems, they struggle with scalability and handling rich inputs, such as images. To address these issues, we introduce a technique that converts neural networks from weight space to function space, through a dual parameterization. Our parameterization offers: (i) a way to scale function-space methods to large data sets via sparsification, (ii) retention of prior knowledge when access to past data is limited, and (iii) a mechanism to incorporate new data without retraining. Our experiments demonstrate that we can retain knowledge in continual learning and incorporate new data efficiently. We further show its strengths in uncertainty quantification and guiding exploration in model-based RL. Further information and code are available on the project website.
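To make the idea of incorporating new data without retraining concrete, the following NumPy sketch shows how sparse dual parameters can be accumulated additively over batches of data. It is an illustration under stated assumptions, not the authors' implementation: the fixed feature map `jac` is a hypothetical stand-in for the network Jacobian at the MAP weights, the likelihood is assumed Gaussian, and the names `dual_update`, `predict`, `delta`, `noise_var`, and `jitter` are introduced here for illustration only. The predictive equations follow a standard sparse dual-parameter GP form and may differ in detail from those in the paper.

```python
import numpy as np

def jac(x):
    # Hypothetical stand-in for the Jacobian J(x) of the network at its MAP weights.
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, np.sin(x)], axis=-1)  # shape (N, P)

def kernel(xa, xb, delta=1.0):
    # Kernel of the linearized model: k(x, x') = J(x) J(x')^T / delta,
    # where delta is an assumed prior precision on the weights.
    return jac(xa) @ jac(xb).T / delta

def dual_update(x, y, f_map, z, noise_var=0.1):
    # Per-point dual variables for a Gaussian likelihood (assumption):
    # first derivative and negative second derivative of log p(y | f) at f = f_map.
    alpha_hat = (y - f_map) / noise_var
    beta_hat = np.full_like(y, 1.0 / noise_var)
    k_zx = kernel(z, x)                   # (M, N)
    alpha_u = k_zx @ alpha_hat            # (M,)
    B_u = (k_zx * beta_hat) @ k_zx.T      # (M, M)
    return alpha_u, B_u

def predict(x_star, z, alpha_u, B_u, jitter=1e-6):
    # Sparse predictive mean and variance from the dual parameters.
    K_zz = kernel(z, z) + jitter * np.eye(len(z))
    k_zs = kernel(z, x_star)              # (M, S)
    mean = k_zs.T @ np.linalg.solve(K_zz, alpha_u)
    A = np.linalg.solve(K_zz, k_zs) - np.linalg.solve(K_zz + B_u, k_zs)
    var = np.diag(kernel(x_star, x_star)) - np.sum(k_zs * A, axis=0)
    return mean, var

rng = np.random.default_rng(0)
w_map = np.array([0.5, -1.0, 2.0])        # pretend MAP weights (illustrative)
z = np.linspace(-3.0, 3.0, 5)             # inducing inputs (illustrative)

x_old = rng.uniform(-3.0, 3.0, 40)
y_old = jac(x_old) @ w_map + 0.3 * rng.standard_normal(40)
x_new = rng.uniform(-3.0, 3.0, 10)
y_new = jac(x_new) @ w_map + 0.3 * rng.standard_normal(10)

# Dual parameters are sums over data points, so a new batch only adds its terms.
a_old, B_old = dual_update(x_old, y_old, jac(x_old) @ w_map, z)
a_new, B_new = dual_update(x_new, y_new, jac(x_new) @ w_map, z)
alpha_u, B_u = a_old + a_new, B_old + B_new

mean, var = predict(np.array([0.0, 1.5]), z, alpha_u, B_u)
```

Because `alpha_u` and `B_u` are sums over data points, adding the contributions of the new batch gives the same result as recomputing them on the combined data, and the cost of prediction depends on the number of inducing points rather than on the data set size.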
