Multi-fidelity No-U-Turn Sampling

Kislaya Ravi, Tobias Neckel, Hans-Joachim Bungartz

TL;DR

The paper addresses the high computational cost of gradient-based MCMC for expensive models by introducing MFNUTS, a framework that uses a multi-fidelity Gaussian Process surrogate to approximate derivatives and guide No-U-Turn sampling. It combines a non-linear multi-fidelity GP construction (including NARGP and derivative fusion variants) with a Delayed Acceptance mechanism to maintain ergodicity with respect to the high-fidelity density. The offline surrogate is trained on a small set of high-/low-fidelity evaluations and the step size is tuned via dual averaging on the surrogate, while the online phase performs sampling with acceptance that references the high-fidelity target. Numerical results on Rosenbrock, an 8-d correlated Gaussian, and a groundwater flow inverse problem show that MFNUTS achieves higher sampling efficiency (mESS) per high-fidelity evaluation compared to MH, HMC, NUTS, and DRAM, demonstrating substantial cost savings without compromising posterior accuracy. The work highlights the value of surrogate-driven gradient proposals in expensive Bayesian inference and points to future extensions with alternative surrogates and augmented rejection schemes.

Abstract

Markov Chain Monte Carlo (MCMC) methods often take many iterations to converge for highly correlated or high-dimensional target density functions. Methods such as Hamiltonian Monte Carlo (HMC) or No-U-Turn Sampling (NUTS) use the first-order derivative of the density function to tackle these issues. However, the calculation of the derivative is a bottleneck for computationally expensive models. We propose to first build a multi-fidelity Gaussian Process (GP) surrogate. The building block of the multi-fidelity surrogate is a hierarchy of models of decreasing approximation error and increasing computational cost. The resulting multi-fidelity surrogate is then used to approximate the derivative. The majority of the computation is assigned to the cheap models, thereby reducing the overall computational cost. The derivative of the multi-fidelity surrogate is used to explore the target density function and generate proposals. We accept or reject the proposals with the Metropolis-Hastings criterion evaluated on the highest-fidelity model, which ensures that the proposed method is ergodic with respect to the highest-fidelity density function. We apply the proposed method to three test cases, including some well-known benchmarks, to compare it with existing methods and show that multi-fidelity No-U-Turn sampling outperforms other methods.
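The core idea of the abstract, proposing with a cheap approximate gradient and accepting with the expensive density, can be illustrated with a minimal sketch. This is not the authors' MFNUTS implementation: it uses plain HMC with a fixed number of leapfrog steps instead of NUTS, omits Delayed Acceptance and dual averaging, and replaces the multi-fidelity GP surrogate with a hypothetical analytic stand-in. The names `log_density_hf`, `grad_surrogate`, and `surrogate_hmc` are assumptions made for illustration only.

```python
import numpy as np


def log_density_hf(theta):
    # Hypothetical stand-in for the expensive high-fidelity log density:
    # a standard Gaussian, so the exact answer is known.
    return -0.5 * np.dot(theta, theta)


def grad_surrogate(theta):
    # Hypothetical stand-in for the surrogate gradient. It is deliberately
    # inexact (factor 0.9 instead of 1.0) to mimic surrogate error.
    return -0.9 * theta


def leapfrog(theta, r, eps, n_steps, grad):
    # Leapfrog integration driven by the supplied gradient of the log density.
    # The map is volume-preserving and reversible for any gradient field
    # that depends on position only.
    theta = theta.copy()
    r = r.copy()
    r += 0.5 * eps * grad(theta)
    for _ in range(n_steps - 1):
        theta += eps * r
        r += eps * grad(theta)
    theta += eps * r
    r += 0.5 * eps * grad(theta)
    return theta, r


def surrogate_hmc(theta0, n_samples, eps=0.3, n_steps=5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    logp = log_density_hf(theta)
    samples = np.empty((n_samples, theta.size))
    n_accept = 0
    for i in range(n_samples):
        r0 = rng.standard_normal(theta.size)
        # Proposal: trajectory computed with the cheap surrogate gradient only.
        theta_new, r_new = leapfrog(theta, r0, eps, n_steps, grad_surrogate)
        # Acceptance: Metropolis-Hastings test on the high-fidelity density,
        # costing one high-fidelity evaluation per iteration.
        logp_new = log_density_hf(theta_new)
        log_alpha = (logp_new - 0.5 * r_new @ r_new) - (logp - 0.5 * r0 @ r0)
        if np.log(rng.uniform()) < log_alpha:
            theta, logp = theta_new, logp_new
            n_accept += 1
        samples[i] = theta
    return samples, n_accept / n_samples


if __name__ == "__main__":
    samples, rate = surrogate_hmc(np.zeros(2), 4000)
    print("acceptance rate:", rate)
    print("sample mean:", samples.mean(axis=0))
    print("sample variance:", samples.var(axis=0))
```

Because the accept/reject step uses the high-fidelity density, the surrogate error in this sketch affects only the acceptance rate, not the distribution the chain targets.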

Paper Structure

This paper contains 12 sections, 2 theorems, 16 equations, 6 figures, and 1 algorithm.

Key Result

lemma 1

The HMC algorithm is ergodic with respect to the canonical density function mentioned in (ravi-eq:canonical-density) provided the leapfrog integrator does not generate periodic proposals.
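For orientation, the canonical density in the standard HMC formulation is recalled here. This is the textbook form, written under the assumption of a position variable $\theta$ with target density $\pi(\theta)$, a momentum variable $r$, and a mass matrix $M$; the paper's own equation may use different notation.

$$
p(\theta, r) \propto \exp\bigl(-H(\theta, r)\bigr), \qquad H(\theta, r) = -\log \pi(\theta) + \tfrac{1}{2}\, r^{\top} M^{-1} r
$$

Marginalizing over $r$ recovers $\pi(\theta)$, which is why ergodicity with respect to the canonical density yields samples from the target.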

Figures (6)

  • Figure 1: Contours of the low- and high-fidelity functions and the samples drawn from the Rosenbrock function using different algorithms
  • Figure 2: mESS over the number of high-fidelity evaluations for Rosenbrock function
  • Figure 3: mESS over the number of high-fidelity evaluations for 8-d Gaussian test case.
  • Figure 4: Setup of the groundwater flow test case with $\theta = \left[ 0.75, 1.25, 0.8, 1.2 \right]$
  • Figure 5: mESS over the number of high-fidelity evaluations for steady-state groundwater flow case.
  • ...and 1 more figure

Theorems & Definitions (4)

  • lemma 1
  • proof
  • lemma 2
  • proof