Generative Bayesian Computation as a Scalable Alternative to Gaussian Process Surrogates

Nick Polson, Vadim Sokolov

TL;DR

Generative Bayesian Computation (GBC) via Implicit Quantile Networks (IQNs) is proposed as a surrogate framework that targets all three limitations of Gaussian process (GP) surrogates: cubic cost, stationarity assumptions, and Gaussian predictive distributions.

Abstract

Gaussian process (GP) surrogates are the default tool for emulating expensive computer experiments, but cubic cost, stationarity assumptions, and Gaussian predictive distributions limit their reach. We propose Generative Bayesian Computation (GBC) via Implicit Quantile Networks (IQNs) as a surrogate framework that targets all three limitations. GBC learns the full conditional quantile function from input-output pairs; at test time, a single forward pass per quantile level produces draws from the predictive distribution. Across fourteen benchmarks we compare GBC to four GP-based methods. GBC improves CRPS by 11-26% on piecewise jump-process benchmarks, by 14% on a ten-dimensional Friedman function, and scales linearly to 90,000 training points where dense-covariance GPs are infeasible. A boundary-augmented variant matches or outperforms Modular Jump GPs on two-dimensional jump datasets (up to 46% CRPS improvement). In active learning, a randomized-prior IQN ensemble achieves nearly three times lower RMSE than deep GP active learning on Rocket LGBB. Overall, GBC records a favorable point estimate in 12 of 14 comparisons. GPs retain an edge on smooth surfaces where their smoothness prior provides effective regularization.
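
The abstract says GBC learns the conditional quantile function but does not name the training objective. IQNs are commonly fit by minimizing the pinball (quantile) loss at randomly drawn quantile levels, so the sketch below assumes that loss. It shows only the core property: the constant that minimizes the average pinball loss at level $\tau$ is approximately the $\tau$-quantile of the data. The function names, sample size, and candidate grid are illustrative choices, not taken from the paper.

```python
import random

def pinball_loss(y: float, q: float, tau: float) -> float:
    """Pinball (quantile) loss of predicting q for observation y at level tau."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def best_constant_quantile(ys, tau, candidates):
    """Return the candidate q with the smallest average pinball loss."""
    def avg_loss(q):
        return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)
    return min(candidates, key=avg_loss)

rng = random.Random(0)
ys = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # toy data: standard normal
grid = [i / 50.0 for i in range(-150, 151)]       # candidate values in [-3, 3]

q_median = best_constant_quantile(ys, 0.5, grid)  # should land near 0
q_upper = best_constant_quantile(ys, 0.9, grid)   # should land near 1.28
print(q_median, q_upper)
```

In a network setting the constant `q` would be replaced by the network output for a given input and quantile level, with the same loss averaged over training pairs and sampled levels.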

Paper Structure

This paper contains 25 sections, 4 theorems, 14 equations, 7 figures, 7 tables, and 1 algorithm.

Key Result

Theorem 1

If $(X, Z)$ are random variables in a Borel space $\mathcal{X} \times \mathcal{Z}$, there exist a random variable $U \perp\!\!\!\perp X$ and a measurable function $G^\star : [0,1] \times \mathcal{X} \to \mathcal{Z}$ such that

$$(X, Z) \overset{\text{a.s.}}{=} \bigl(X, G^\star(U, X)\bigr).$$
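
As a concrete instance of the noise-outsourcing idea, the sketch below uses a conditional law whose map $G^\star$ is known in closed form. It assumes $Z \mid X = x \sim \text{Exponential}(\text{rate} = x)$, for which the inverse CDF gives $G^\star(u, x) = -\log(1 - u)/x$. The distribution, the fixed input value, and the names are hypothetical and chosen only for illustration; in GBC the map is learned rather than written down.

```python
import math
import random

def g_star(u: float, x: float) -> float:
    """Hypothetical map G*(u, x): inverse CDF of Exponential(rate=x) at u."""
    return -math.log(1.0 - u) / x

rng = random.Random(1)
x = 2.0      # fixed conditioning input
n = 50000

# Outsourced noise U ~ Uniform(0, 1), drawn independently of x.
pushed = [g_star(rng.random(), x) for _ in range(n)]
# Direct draws from the same conditional law, for comparison.
direct = [rng.expovariate(x) for _ in range(n)]

mean_pushed = sum(pushed) / n
mean_direct = sum(direct) / n
print(mean_pushed, mean_direct)   # both should be close to 1 / x
```

Pushing uniform noise through the map reproduces the conditional distribution, which is why a learned quantile function can serve as a sampler.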

Figures (7)

  • Figure 1: GP vs. GBC on a 1D jump function ($n = 100$, noiseless). The stationary GP (left) adapts its length scale near the discontinuity but blurs the jump; GBC (right) widens its quantile bands at the boundary. Details on the IQN architecture are in Section \ref{sec:gbc}.
  • Figure 2: IQN architecture. The input $x$ and quantile level $\tau$ are processed through separate branches ($f_x$ and $f_\tau \circ \phi$), merged via elementwise product, and decoded into a location estimate $\hat{\mu}$ and a quantile prediction $\hat{q}_\tau$. The cosine embedding $\phi(\tau)$ enables smooth interpolation over quantile levels.
  • Figure 3: Motorcycle benchmark: stationary GP (top left), hetGP (top right), single IQN (bottom left), and $K{=}5$ IQN ensemble (bottom right). Shaded region: 90% predictive interval. The stationary GP assigns near-constant uncertainty; hetGP adapts via a latent noise process; GBC adapts without a parametric noise model.
  • Figure 4: BGP benchmark ($d=2$): predicted mean surface from stationary GP (left) vs. GBC/IQN (right). Dashed line: true partition boundary $a^\top x = 0$. GP smooths across the jump; GBC captures the regime change.
  • Figure 5: Friedman $d=10$: training time (left) and test RMSE (right) vs. sample size. GP time grows as $\mathcal{O}(n^3)$; GBC grows approximately linearly. GBC achieves lower RMSE at all $n$ and remains feasible as $n$ grows beyond GP's reach.
  • ...and 2 more figures
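
The Figure 2 caption describes the forward pass in enough detail to sketch it. The code below is a minimal, untrained illustration under several assumptions not stated in the caption: one dense ReLU layer per branch, a cosine embedding of the form $\phi_j(\tau) = \cos(\pi j \tau)$, and two linear heads reading the merged features. The class name `TinyIQN`, the layer sizes, and the random weights are hypothetical.

```python
import math
import random

def cosine_embedding(tau, n_cos):
    """phi(tau) = [cos(pi * j * tau) for j = 0..n_cos-1] (assumed form)."""
    return [math.cos(math.pi * j * tau) for j in range(n_cos)]

def dense(weights, bias, v, relu=True):
    """Affine map followed by an optional ReLU."""
    out = [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out

def init(rng, n_out, n_in):
    """Random (untrained) weights, used only to exercise the forward pass."""
    w = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
    return w, [0.0] * n_out

class TinyIQN:
    def __init__(self, d_in, hidden=16, n_cos=8, seed=0):
        rng = random.Random(seed)
        self.n_cos = n_cos
        self.fx = init(rng, hidden, d_in)      # input branch f_x
        self.ftau = init(rng, hidden, n_cos)   # quantile branch f_tau
        self.mu_head = init(rng, 1, hidden)    # location head (assumed layout)
        self.q_head = init(rng, 1, hidden)     # quantile head (assumed layout)

    def forward(self, x, tau):
        hx = dense(*self.fx, x)
        ht = dense(*self.ftau, cosine_embedding(tau, self.n_cos))
        merged = [a * b for a, b in zip(hx, ht)]   # elementwise product
        mu_hat = dense(*self.mu_head, merged, relu=False)[0]
        q_hat = dense(*self.q_head, merged, relu=False)[0]
        return mu_hat, q_hat

model = TinyIQN(d_in=2)
mu_hat, q_hat = model.forward([0.3, -1.2], tau=0.9)
print(mu_hat, q_hat)
```

Because the weights are random, the outputs carry no meaning and are not guaranteed to be monotone in $\tau$; the sketch only shows how the two branches combine.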

Theorems & Definitions (6)

  • Theorem 1: Noise Outsourcing (Kallenberg, 1997, Thm. 5.10)
  • Definition 2: GBC Framework
  • Proposition 3: Universal approximation of conditional quantiles
  • Proposition 4: CRPS approximation bound
  • Proposition 5: Consistency
  • Remark 6: Theory vs. practice
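
CRPS is the headline metric in the abstract and the subject of Proposition 4. A standard identity links it to quantile functions: $\mathrm{CRPS}(F, y) = 2 \int_0^1 \rho_\tau\bigl(y - F^{-1}(\tau)\bigr)\, d\tau$, where $\rho_\tau$ is the pinball loss. The sketch below approximates that integral on a grid of quantile levels and checks it against the closed-form CRPS of a Gaussian forecast. It is an illustration of the identity, not necessarily the estimator used in the paper; the function names and grid size are assumptions.

```python
import math
from statistics import NormalDist

def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of predicting q for observation y at level tau."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def crps_from_quantiles(quantile_fn, y, n_levels=2000):
    """Midpoint-rule estimate of 2 * integral_0^1 pinball(y, Q(tau), tau) d tau."""
    total = 0.0
    for k in range(n_levels):
        tau = (k + 0.5) / n_levels
        total += pinball_loss(y, quantile_fn(tau), tau)
    return 2.0 * total / n_levels

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Normal(mu, sigma) forecast at observation y."""
    z = (y - mu) / sigma
    std = NormalDist()
    return sigma * (z * (2.0 * std.cdf(z) - 1.0)
                    + 2.0 * std.pdf(z) - 1.0 / math.sqrt(math.pi))

forecast = NormalDist(mu=0.0, sigma=1.0)
approx = crps_from_quantiles(forecast.inv_cdf, y=0.7)
exact = crps_gaussian(0.0, 1.0, 0.7)
print(approx, exact)
```

Any model that returns quantiles on a grid of levels, including an IQN, can be scored this way by passing its quantile function in place of `forecast.inv_cdf`.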