Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes
Tim Gyger, Reinhard Furrer, Fabio Sigrist
TL;DR
This work introduces Vecchia-inducing-points full-scale (VIF) approximations for scalable Gaussian process (GP) inference. VIF combines global inducing points with a local Vecchia approximation of the residual process, so that it captures both large- and small-scale dependencies. Neighbors for the Vecchia part are selected by a correlation-based search implemented with a modified cover tree. For non-Gaussian likelihoods, the framework uses Laplace approximations together with iterative solvers and preconditioners (VIFDU and FITC), which reduce computational cost by several orders of magnitude relative to Cholesky-based computations. The authors provide theoretical convergence results for the preconditioned conjugate gradient (CG) method, propose efficient estimators of predictive variances, and evaluate the approach on simulated and real-world data sets, where it is more accurate than state-of-the-art GP methods at competitive runtimes. The methods are implemented in GPBoost with Python and R interfaces.
Abstract
Gaussian processes are flexible, probabilistic, non-parametric models widely used in machine learning and statistics. However, their scalability to large data sets is limited by computational constraints. To overcome these challenges, we propose Vecchia-inducing-points full-scale (VIF) approximations combining the strengths of global inducing points and local Vecchia approximations. Vecchia approximations excel in settings with low-dimensional inputs and moderately smooth covariance functions, while inducing point methods are better suited to high-dimensional inputs and smoother covariance functions. Our VIF approach bridges these two regimes by using an efficient correlation-based neighbor-finding strategy for the Vecchia approximation of the residual process, implemented via a modified cover tree algorithm. We further extend our framework to non-Gaussian likelihoods by introducing iterative methods that, when using a Laplace approximation, reduce computational costs for training and prediction by several orders of magnitude compared to Cholesky-based computations. In particular, we propose and compare novel preconditioners and provide theoretical convergence results. Extensive numerical experiments on simulated and real-world data sets show that VIF approximations are computationally efficient and also more accurate and numerically stable than state-of-the-art alternatives. All methods are implemented in the open-source C++ library GPBoost with high-level Python and R interfaces.
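To make the structure of the approximation concrete, the following is a minimal dense sketch in Python of the idea described above: a low-rank inducing-point term plus a Vecchia approximation of the residual covariance, with neighbors chosen by largest absolute residual correlation. It is an illustration under stated assumptions, not the GPBoost implementation: the function names `exp_kernel` and `vif_covariance` are hypothetical, the kernel is an exponential kernel chosen for numerical convenience, the neighbor search is brute force rather than a modified cover tree, and all matrices are formed densely, which is only feasible for small n.

```python
import numpy as np

def exp_kernel(A, B, lengthscale=0.3, variance=1.0):
    """Exponential (Matern 1/2) covariance between rows of A and rows of B."""
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    return variance * np.exp(-d / lengthscale)

def vif_covariance(X, Z, n_neighbors=5, jitter=1e-8):
    """Toy VIF-style covariance: inducing-point part + Vecchia residual part.

    Hypothetical helper for illustration only (dense, O(n^3) at the end).
    """
    n = X.shape[0]
    K_nm = exp_kernel(X, Z)
    K_mm = exp_kernel(Z, Z) + jitter * np.eye(Z.shape[0])
    # Global part: covariance explained by the inducing points Z.
    low_rank = K_nm @ np.linalg.solve(K_mm, K_nm.T)
    # Residual covariance left over after the inducing-point part.
    R = exp_kernel(X, X) - low_rank
    sd = np.sqrt(np.diag(R))
    corr = R / np.outer(sd, sd)

    # Vecchia approximation of the residual: R^{-1} ~ B^T D^{-1} B,
    # with B unit lower triangular and D diagonal.
    B = np.eye(n)
    D = np.empty(n)
    D[0] = R[0, 0]
    for i in range(1, n):
        # Brute-force correlation-based neighbor search among earlier points
        # (the paper uses a modified cover tree instead).
        nb = np.argsort(-np.abs(corr[i, :i]))[:n_neighbors]
        a = np.linalg.solve(R[np.ix_(nb, nb)], R[nb, i])
        B[i, nb] = -a                      # conditional-mean weights
        D[i] = R[i, i] - R[nb, i] @ a      # conditional variance

    # Back to covariance scale: (B^T D^{-1} B)^{-1} = B^{-1} D B^{-T}.
    B_inv = np.linalg.inv(B)
    return low_rank + B_inv @ np.diag(D) @ B_inv.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(40, 2))   # data locations
    Z = rng.uniform(size=(8, 2))    # inducing points
    K = exp_kernel(X, X)
    for k in (1, 5, 39):
        err = np.linalg.norm(vif_covariance(X, Z, n_neighbors=k) - K)
        print(f"neighbors={k:2d}  Frobenius error={err:.3e}")
```

When every point conditions on all earlier points, the Vecchia factorization of the residual is exact, so the sketch reproduces the full covariance matrix; with fewer neighbors the factors `B` and `D` become sparse, which is the source of the computational savings in a real implementation.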
