Iterative Methods for Vecchia-Laplace Approximations for Latent Gaussian Process Models
Pascal Kündig, Fabio Sigrist
TL;DR
This paper addresses the computational bottlenecks of Vecchia-Laplace approximations for latent Gaussian process models with non-Gaussian likelihoods by introducing iterative methods based on preconditioned conjugate gradient and stochastic Lanczos quadrature. It proposes a suite of preconditioners (VADU, LVA, LRAC, ZIRC) and derives convergence guarantees, along with variance-reduction strategies for stochastic gradients. Two paths for predictive covariance computation are developed: simulation-based estimators and Lanczos-based approximations using specialized preconditioners, with theoretical error bounds. Empirical results on simulated and real-world data show substantial speedups (often an order of magnitude) and improved predictive performance relative to Cholesky-based approaches and prior Vecchia methods, all implemented in the GPBoost library; these advances enable scalable, accurate inference for Vecchia-Laplace GP models, though convergence still depends on covariance structure and fixed effects parameters.
Abstract
Latent Gaussian process (GP) models are flexible probabilistic non-parametric function models. Vecchia approximations are accurate approximations for GPs to overcome computational bottlenecks for large data, and the Laplace approximation is a fast method with asymptotic convergence guarantees to approximate marginal likelihoods and posterior predictive distributions for non-Gaussian likelihoods. Unfortunately, the computational complexity of combined Vecchia-Laplace approximations grows faster than linearly in the sample size when used in combination with direct solver methods such as the Cholesky decomposition. Computations with Vecchia-Laplace approximations can thus become prohibitively slow precisely when the approximations are usually the most accurate, i.e., on large data sets. In this article, we present iterative methods to overcome this drawback. Among other things, we introduce and analyze several preconditioners, derive new convergence results, and propose novel methods for accurately approximating predictive variances. We analyze our proposed methods theoretically and in experiments with simulated and real-world data. In particular, we obtain a speed-up of an order of magnitude compared to Cholesky-based calculations and a threefold increase in prediction accuracy in terms of the continuous ranked probability score compared to a state-of-the-art method on a large satellite data set. All methods are implemented in a free C++ software library with high-level Python and R packages.
