Efficient Fourier representations of families of Gaussian processes

Philip Greengard

TL;DR

Addresses the computational bottleneck of Gaussian process regression for translation-invariant kernels by introducing a Fourier representation that remains valid over ranges of hyperparameters. A weight-space expansion $f(x) = \sum_{i=1}^{m} \alpha_i \gamma_i \cos(2\pi \xi_i x) + \beta_i \gamma_i \sin(2\pi \xi_i x)$ is constructed with frequencies determined by generalized Gaussian quadratures, yielding a kernel approximation $k'(x-y)$ with controllable accuracy. After a one-time precomputation costing $O(N + m^2 \log m)$, GP regression and determinant calculations across all hyperparameters require $O(m^3)$ time each, thanks to the non-uniform FFT for forming $X^T X$ and related matrices. Numerical experiments on Matérn and squared-exponential kernels in 1D demonstrate accurate kernel approximation and scalable inference, with natural pathways to higher dimensions via tensor-product expansions, albeit subject to the curse of dimensionality.

Abstract

We introduce a class of algorithms for constructing Fourier representations of Gaussian processes in one dimension that are valid over ranges of hyperparameter values. The scaling and frequencies of the Fourier basis functions are evaluated numerically via generalized quadratures. The representations introduced allow for $O(m^3)$ inference, independent of $N$, for all hyperparameters in the user-specified range after $O(N + m^2\log{m})$ precomputation, where $N$, the number of data points, is usually significantly larger than $m$, the number of basis functions. Inference independent of $N$ for various hyperparameters is facilitated by generalized quadratures, and the $O(N + m^2\log{m})$ precomputation is achieved with the non-uniform FFT. Numerical results are provided for Matérn kernels with $\nu \in [3/2, 7/2]$ and lengthscale $\rho \in [0.1, 0.5]$, and for squared-exponential kernels with lengthscale $\rho \in [0.1, 0.5]$. The algorithms of this paper generalize mathematically to higher dimensions, though they suffer from the standard curse of dimensionality.
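The split between a one-time pass over the data and a per-hyperparameter $O(m^3)$ solve can be illustrated with a minimal Python sketch for the squared-exponential kernel. Several pieces are assumptions made for illustration and are not taken from the paper: Gauss-Legendre nodes on a truncated interval stand in for the generalized Gaussian quadrature, the scalings are taken to be $\gamma_i = \sqrt{2 w_i \hat{k}(\xi_i)}$ with $\hat{k}$ the Fourier transform of the kernel, the matrix of inner products is formed by a direct $O(N m^2)$ product rather than the non-uniform FFT, and all function names are hypothetical.

```python
import numpy as np


def se_spectral_density(xi, rho):
    """Fourier transform of k(d) = exp(-d^2 / (2 rho^2)), e^{-2 pi i xi d} convention."""
    return rho * np.sqrt(2.0 * np.pi) * np.exp(-2.0 * (np.pi * rho * xi) ** 2)


def stand_in_nodes(m, rho_min):
    """Gauss-Legendre nodes/weights on [0, L]; a stand-in for generalized Gaussian quadrature.

    L is chosen so the spectral density at rho_min has decayed by a factor of 1e-16.
    """
    L = np.sqrt(8.0 * np.log(10.0)) / (np.pi * rho_min)
    t, w = np.polynomial.legendre.leggauss(m)
    return 0.5 * L * (t + 1.0), 0.5 * L * w


def features(x, xi):
    """Unscaled basis: columns cos(2 pi xi_i x) followed by sin(2 pi xi_i x)."""
    phase = 2.0 * np.pi * np.outer(x, xi)
    return np.hstack([np.cos(phase), np.sin(phase)])


def scalings(xi, w, rho):
    """Assumed form gamma_i = sqrt(2 w_i khat(xi_i)), repeated for the cos and sin columns."""
    gamma = np.sqrt(2.0 * w * se_spectral_density(xi, rho))
    return np.concatenate([gamma, gamma])


def precompute(x, y, xi):
    """One-time pass over the data (direct products here, not the non-uniform FFT)."""
    F = features(x, xi)
    return F.T @ F, F.T @ y


def posterior_mean_coeffs(G, Fty, d, sigma2):
    """Solve (D G D + sigma^2 I) beta = D F^T y; return D beta (coefficients on the unscaled basis)."""
    A = d[:, None] * G * d[None, :] + sigma2 * np.eye(d.size)
    return d * np.linalg.solve(A, d * Fty)


# Synthetic data on [-1, 1], for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=300)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)

xi, w = stand_in_nodes(m=200, rho_min=0.1)
G, Fty = precompute(x, y, xi)  # computed once, reused for every rho below

x_new = np.linspace(-1.0, 1.0, 5)
for rho in (0.1, 0.3, 0.5):
    coeffs = posterior_mean_coeffs(G, Fty, scalings(xi, w, rho), sigma2=0.01)
    print(rho, features(x_new, xi) @ coeffs)
```

Because the frequencies $\xi_i$ are shared across the whole lengthscale range, `G` and `Fty` do not depend on $\rho$. Changing the hyperparameter only rescales them by the diagonal of scalings before the $2m \times 2m$ solve, so the data are never revisited.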

Paper Structure

This paper contains 10 sections, 1 theorem, 51 equations, 4 figures, 6 tables, 2 algorithms.

Key Result

Theorem 1

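Let $f$ be the random expansion defined by $f(x) = \sum_{i=1}^{m} \alpha_i \gamma_i \cos(2\pi \xi_i x) + \beta_i \gamma_i \sin(2\pi \xi_i x)$, where $\alpha_i, \beta_j$ for all $i,j = 1,...,m$ are iid standard normal random variables and the $\gamma_i$ are defined in terms of some $\xi_i, w_i > 0$. Then $f$ is a Gaussian process with covariance kernel $k'$ defined by the formula $k'(x - y) = \sum_{i=1}^{m} \gamma_i^2 \cos(2\pi \xi_i (x - y))$.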
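
The covariance of the expansion can be checked directly. The derivation below is a sketch that assumes the coefficients $\alpha_i, \beta_j$ are iid standard normal, so that $E[\alpha_i \alpha_j] = E[\beta_i \beta_j] = \delta_{ij}$ and $E[\alpha_i \beta_j] = 0$, and it uses the expansion $f(x) = \sum_{i=1}^{m} \alpha_i \gamma_i \cos(2\pi \xi_i x) + \beta_i \gamma_i \sin(2\pi \xi_i x)$ given in the TL;DR:

$$
\begin{aligned}
E[f(x) f(y)]
&= \sum_{i=1}^{m} \gamma_i^2 \left[ \cos(2\pi \xi_i x) \cos(2\pi \xi_i y) + \sin(2\pi \xi_i x) \sin(2\pi \xi_i y) \right] \\
&= \sum_{i=1}^{m} \gamma_i^2 \cos(2\pi \xi_i (x - y)).
\end{aligned}
$$

All cross terms vanish by independence, and the second line uses the angle-difference identity for the cosine. Since $f$ is a linear combination of jointly Gaussian coefficients with mean zero, any finite set of its values is jointly Gaussian. Under these assumptions $f$ is therefore a Gaussian process whose covariance depends only on $x - y$.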

Figures (4)

  • Figure 1: Location of the $86$ nodes for GPs defined on $[-1, 1]$ with Matérn kernels with $\nu \in [1.5, 3.5]$, and $\rho \in [0.1, 0.5]$.
  • Figure 2: Location of the nodes for GPs defined on $[-1, 1]$ with squared-exponential kernels with $\rho \in [0.1, 0.5]$. Both sets of nodes were generated with Algorithm \ref{a10} with different error tolerances $\epsilon$.
  • Figure 3: $\log_{10}$ of the $L^2$ errors in the effective covariance kernel for Matérn kernels. Specifically, we plot $\log_{10} \| k_{\nu, \rho} - k_{\nu, \rho}'\|_2$ for various $\nu, \rho$, where $k'_{\nu, \rho}$ denotes the effective kernel and $k_{\nu, \rho}$ the exact kernel.
  • Figure 4: Scaling of computation times for evaluating the conditional mean with a Matérn kernel for varying amounts of data. We include a curve proportional to $N$ for comparison.

Theorems & Definitions (2)

  • Theorem 1
  • proof