Vecchia approximated Bayesian heteroskedastic Gaussian processes

Parul V. Patil, Robert B. Gramacy, Cayelan C. Carey, R. Quinn Thomas

TL;DR

A Bayesian hetGP is proposed using elliptical slice sampling (ESS) for posterior variance integration, and the Vecchia approximation to circumvent computational bottlenecks.

Abstract

Many computer simulations are stochastic and exhibit input-dependent noise. In such situations, heteroskedastic Gaussian processes (hetGPs) make ideal surrogates as they estimate a latent, non-constant variance. However, existing hetGP implementations are unable to deal with large simulation campaigns and use point estimates for all unknown quantities, including latent variances. This limits applicability to small experiments and understates uncertainty. We propose a Bayesian hetGP using elliptical slice sampling (ESS) for posterior variance integration, and the Vecchia approximation to circumvent computational bottlenecks. We show good performance for our upgraded hetGP capability, compared to alternatives, on a benchmark example and a motivating corpus of more than 9 million lake temperature simulations. An open-source implementation is provided as bhetGP on CRAN.
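To make the role of ESS concrete, the following is a minimal sketch of a generic elliptical slice sampling update for a latent vector with a zero-mean Gaussian prior, here used to sample latent log-variances. It is not the bhetGP implementation: the function name `ess_step`, the toy squared-exponential prior, the lengthscale, the jitter, and the zero-mean Gaussian likelihood are all assumptions chosen for illustration, and no Vecchia approximation is involved.

```python
import numpy as np


def ess_step(f, chol_prior, log_lik, rng):
    """One elliptical slice sampling update for f with prior N(0, L L^T)."""
    nu = chol_prior @ rng.standard_normal(f.shape[0])  # auxiliary prior draw
    log_y = log_lik(f) - rng.exponential()             # log of the slice height
    theta = rng.uniform(0.0, 2.0 * np.pi)              # initial angle
    theta_min, theta_max = theta - 2.0 * np.pi, theta  # initial bracket
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)  # point on the ellipse
        if log_lik(f_new) > log_y:
            return f_new                                # proposal accepted
        # Rejected: shrink the bracket toward theta = 0 (the current state).
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)


rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)

# Hypothetical prior on the latent log-variances: squared-exponential kernel
# with an arbitrary lengthscale (0.2) and jitter (1e-6) for stability.
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2 ** 2) + 1e-6 * np.eye(x.size)
L = np.linalg.cholesky(K)

# Toy data: zero-mean responses whose variance changes with the input.
true_log_var = np.sin(2.0 * np.pi * x)
y = rng.standard_normal(x.size) * np.exp(0.5 * true_log_var)


def log_lik(lam):
    """Gaussian log-likelihood (up to a constant) given log-variances lam."""
    return -0.5 * np.sum(lam + y ** 2 * np.exp(-lam))


lam = np.zeros(x.size)
samples = np.empty((200, x.size))
for t in range(samples.shape[0]):
    lam = ess_step(lam, L, log_lik, rng)
    samples[t] = lam
```

Each update needs no tuning parameters and always returns a new state, because the bracket shrinks toward the current value, which lies on the slice by construction. In a full hetGP the likelihood would also involve a mean process; that part is omitted here.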

Paper Structure

This paper contains 27 sections, 25 equations, 14 figures, 1 algorithm.

Figures (14)

  • Figure 1: NOAA-GLM simulations ahead to a horizon of 30 days, varying day of year (DOY) in 2022 and depth from the surface of the lake in meters.
  • Figure 2: Top: Fits on the motorcycle data via (MLE, homoskedastic) GP, hetGP, and our Bayesian hetGP. Bottom: estimated log-noise, numbers indicate empirical variances based on that many replicates.
  • Figure 3: ESS iterations ($t$) in 1D. Black lines indicate the initial value, dashed lines are rejected proposals and solid lines are the final accepted ones. The last panel shows all ESS samples after burn-in and thinning.
  • Figure 4: Left: exploring statistical efficiency with $\Lambda_N$ sampling (black points) versus $\Lambda_n$ (red triangles). Right: exploring computational efficiency for increasing experiment sizes, $N = 50n$ over 1000 MCMC iterations.
  • Figure 5: Left: Vecchia approximation showing chosen NN without (top) and with (bottom) Woodbury likelihood; middle: posterior surfaces and MAP estimate; right: computation time from fifty reps.
  • ...and 9 more figures