Sparse Orthogonal Variational Inference for Gaussian Processes
Jiaxin Shi, Michalis K. Titsias, Andriy Mnih
TL;DR
This work addresses the scalability gap in Gaussian processes (GPs) by reinterpreting sparse variational GP (SVGP) inference as an orthogonal decomposition of the GP prior into two independent components. The proposed method, SOLVE-GP, introduces an additional set of inducing points for the orthogonal component. The resulting structured variational approximation yields tighter lower bounds on the marginal likelihood and, under a fixed computational budget, allows more inducing points to be used. The framework recovers SVGP as a special case, connects to decoupled inducing-point methods, and extends to inter-domain, convolutional, and deep GP models, with state-of-the-art results on CIFAR-10 among purely GP-based models.
Abstract
We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points, which can lead to more scalable algorithms than previous methods. It is based on decomposing a Gaussian process as a sum of two independent processes: one spanned by a finite basis of inducing points and the other capturing the remaining variation. We show that this formulation recovers existing approximations and at the same time allows us to obtain tighter lower bounds on the marginal likelihood and new stochastic variational inference algorithms. We demonstrate the efficiency of these algorithms in several Gaussian process models, ranging from standard regression to multi-class classification using (deep) convolutional Gaussian processes, and report state-of-the-art results on CIFAR-10 among purely GP-based models.
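The decomposition described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a squared-exponential kernel, 1-D inputs, and hypothetical names (`rbf_kernel`, `x`, `z`, `k_perp`) chosen only for illustration. It splits the prior covariance K_ff into the part spanned by the inducing points, Q_ff = K_fu K_uu^{-1} K_uf, and the residual K_ff - Q_ff, then checks that the residual process is uncorrelated with the inducing variables.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


# Illustrative inputs: 50 evaluation points and 8 inducing points (both assumptions).
x = np.linspace(-3.0, 3.0, 50)
z = np.linspace(-3.0, 3.0, 8)
jitter = 1e-8  # small diagonal term for a stable Cholesky factorization

k_ff = rbf_kernel(x, x)
k_fu = rbf_kernel(x, z)
k_uu = rbf_kernel(z, z) + jitter * np.eye(len(z))

# Covariance of the component spanned by the inducing points:
# Q_ff = K_fu K_uu^{-1} K_uf, computed through a Cholesky factor of K_uu.
chol_uu = np.linalg.cholesky(k_uu)
a = np.linalg.solve(chol_uu, k_fu.T)  # L^{-1} K_uf
q_ff = a.T @ a

# Covariance of the orthogonal component that captures the remaining variation.
k_perp = k_ff - q_ff

# Cross-covariance between the orthogonal component and the inducing variables:
# Cov(f - K_fu K_uu^{-1} u, u) = K_fu - K_fu K_uu^{-1} K_uu, which should vanish.
cross_cov = k_fu - k_fu @ np.linalg.solve(k_uu, k_uu)

print("max |cross-covariance|:", np.max(np.abs(cross_cov)))
print("min eigenvalue of residual covariance:", np.linalg.eigvalsh(k_perp).min())
print("residual variance at an inducing location:", k_perp[0, 0])
```

Because `x[0]` coincides with the first inducing point in this setup, the residual variance there is close to zero: the inducing-point component already explains the function at that location, and the orthogonal component only carries variation away from the inducing points.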
