Gaussian process regression with log-linear scaling for common non-stationary kernels
P. Michael Kielstra, Michael Lindsey
TL;DR
The paper develops a fast framework for kernel matrix-vector multiplication in Gaussian process regression with non-stationary kernels in low dimensions, extending the equispaced Fourier GP approach to spatially varying scales via the Schoenberg representation. By decomposing the kernel into a sum of Gaussian convolutions and employing Chebyshev interpolation in σ together with NUFFT-based Fourier discretization, it achieves near-linear matvec cost in N, with error that converges exponentially in the tunable parameters. The authors provide a rigorous error analysis showing that the overall error $\|\mathcal{K} - \tilde{\mathcal{K}}\|$ is controlled by a combination of discretization and interpolation errors, and they demonstrate the approach with numerical experiments that exhibit favorable scaling against state-of-the-art rank-structured methods. The method enables efficient CG-based solves for GPR with non-stationary Matérn kernels and improves scalability for multi-dimensional problems, potentially benefiting applications that require flexible, non-stationary kernel design.
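To illustrate why Chebyshev interpolation in σ can converge exponentially, the following minimal sketch interpolates the Gaussian factor $\exp(-r^2/(2\sigma^2))$ as a function of σ and reports the maximum error for increasing degree. The interval $[0.5, 2]$, the sample distances $r$, the degrees, and the helper names `gaussian_in_sigma` and `max_interp_error` are assumptions chosen for illustration; the exact quantity the authors interpolate and their parameter choices may differ.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def gaussian_in_sigma(sigma, r):
    """Gaussian factor exp(-r^2 / (2 sigma^2)), viewed as a function of sigma."""
    return np.exp(-r**2 / (2.0 * sigma**2))


def max_interp_error(deg, r, sigma_min=0.5, sigma_max=2.0):
    """Max error of the degree-`deg` Chebyshev interpolant in sigma (hypothetical helper)."""
    interp = Chebyshev.interpolate(
        gaussian_in_sigma, deg, domain=[sigma_min, sigma_max], args=(r,)
    )
    sigma_test = np.linspace(sigma_min, sigma_max, 2001)
    exact = gaussian_in_sigma(sigma_test, r)
    return np.max(np.abs(interp(sigma_test) - exact))


# Illustrative distances and degrees (assumed values, not taken from the paper).
distances = (0.5, 1.0, 3.0)
errors = {
    deg: max(max_interp_error(deg, r) for r in distances)
    for deg in (4, 8, 16, 32)
}

for deg, err in errors.items():
    print(f"degree {deg:2d}: max interpolation error {err:.2e}")
```

Because the Gaussian factor is analytic in σ on any interval bounded away from zero, the error should fall roughly geometrically as the degree grows, which is the behavior a tunable interpolation order is meant to exploit.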
Abstract
We introduce a fast algorithm for Gaussian process regression in low dimensions, applicable to a widely used family of non-stationary kernels. The non-stationarity of these kernels is induced by arbitrary spatially varying vertical and horizontal scales. In particular, any stationary kernel can be accommodated as a special case, and we focus especially on the generalization of the standard Matérn kernel. Our subroutine for kernel matrix-vector multiplications scales almost optimally as $O(N\log N)$, where $N$ is the number of regression points. Like the recently developed equispaced Fourier Gaussian process (EFGP) methodology, which is applicable only to stationary kernels, our approach exploits non-uniform fast Fourier transforms (NUFFTs). We offer a complete analysis controlling the approximation error of our method, and we validate the method's practical performance with numerical experiments. In particular, we demonstrate improved scalability compared to state-of-the-art rank-structured approaches in spatial dimension $d>1$.
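A fast matrix-vector product matters because an iterative solver such as conjugate gradients (CG) touches the kernel matrix only through matvecs. The sketch below shows that pattern for the GPR linear system $(K + \sigma_n^2 I)\alpha = y$. It is not the authors' method: the matvec here is a dense $O(N^2)$ product with a stationary Matérn-3/2 kernel, standing in for the $O(N\log N)$ NUFFT-based subroutine. The data, length scale, noise level, and the function names `matern32` and `conjugate_gradient` are all assumptions made for illustration.

```python
import numpy as np


def matern32(x, y, ell=0.2):
    """Stationary Matern-3/2 kernel in 1D (stand-in for a non-stationary kernel)."""
    r = np.abs(x[:, None] - y[None, :])
    s = np.sqrt(3.0) * r / ell
    return (1.0 + s) * np.exp(-s)


def conjugate_gradient(matvec, b, tol=1e-8, max_iter=2000):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0 initially
    p = r.copy()          # search direction
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


# Synthetic 1D regression problem (assumed data, for illustration only).
rng = np.random.default_rng(0)
N = 200
x_train = np.sort(rng.uniform(0.0, 1.0, N))
y_train = np.sin(2.0 * np.pi * x_train) + 0.1 * rng.standard_normal(N)
noise_var = 0.1**2

K = matern32(x_train, x_train)


def matvec(v):
    # Dense placeholder; a fast method would replace this product.
    return K @ v + noise_var * v


alpha = conjugate_gradient(matvec, y_train)

# Posterior mean at a few test points.
x_test = np.linspace(0.0, 1.0, 5)
posterior_mean = matern32(x_test, x_train) @ alpha
print(posterior_mean)
```

Only `matvec` would need to change to use a faster kernel product; the CG loop is agnostic to how the product is computed, so the per-iteration cost is set entirely by the matvec.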
