Table of Contents
Fetching ...

Piecewise deterministic sampling with splitting schemes

Andrea Bertazzi, Paul Dobson, Pierre Monmarché

TL;DR

The paper develops a splitting-scheme framework for piecewise-deterministic Markov process (PDMP) based MCMC, enabling second-order weak accuracy (in the step size) with a cost of one gradient evaluation per iteration. It introduces unadjusted schemes built from Strang-type splits and augments them with non-reversible Metropolis adjustments to recover exact invariant measures, providing explicit rejection rates and conditions for ergodicity. Theoretical results establish second-order weak error, geometric ergodicity, and an invariant-measure expansion mu_delta = mu (1 - delta^2 f_2 + O(delta^4)), with explicit one-dimensional characterizations for certain splittings. The work analyzes how splitting structure and refreshment rates affect bias and convergence, and demonstrates practical benefits through numerical experiments in Bayesian imaging and interacting-particle systems. Overall, the approach offers a scalable, robust toolkit for efficient, bias-controlled PDMP-based MCMC, with clear guidance on which splittings perform best in practice.

Abstract

We introduce Markov chain Monte Carlo (MCMC) algorithms based on numerical approximations of piecewise-deterministic Markov processes obtained with the framework of splitting schemes. We present unadjusted as well as adjusted algorithms, for which the asymptotic bias due to the discretisation error is removed applying a non-reversible Metropolis-Hastings filter. In a general framework we demonstrate that the unadjusted schemes have weak error of second order in the step size, while typically maintaining a computational cost of only one gradient evaluation of the negative log-target function per iteration. Focusing then on unadjusted schemes based on the Bouncy Particle and Zig-Zag samplers, we provide conditions ensuring geometric ergodicity and consider the expansion of the invariant measure in terms of the step size. We analyse the dependence of the leading term in this expansion on the refreshment rate and on the structure of the splitting scheme, giving a guideline on which structure is best. Finally, we illustrate promising results for our samplers with numerical experiments on a Bayesian imaging inverse problem and a system of interacting particles.

Piecewise deterministic sampling with splitting schemes

TL;DR

The paper develops a splitting-scheme framework for piecewise-deterministic Markov process (PDMP) based MCMC, enabling second-order weak accuracy (in the step size) with a cost of one gradient evaluation per iteration. It introduces unadjusted schemes built from Strang-type splits and augments them with non-reversible Metropolis adjustments to recover exact invariant measures, providing explicit rejection rates and conditions for ergodicity. Theoretical results establish second-order weak error, geometric ergodicity, and an invariant-measure expansion mu_delta = mu (1 - delta^2 f_2 + O(delta^4)), with explicit one-dimensional characterizations for certain splittings. The work analyzes how splitting structure and refreshment rates affect bias and convergence, and demonstrates practical benefits through numerical experiments in Bayesian imaging and interacting-particle systems. Overall, the approach offers a scalable, robust toolkit for efficient, bias-controlled PDMP-based MCMC, with clear guidance on which splittings perform best in practice.

Abstract

We introduce Markov chain Monte Carlo (MCMC) algorithms based on numerical approximations of piecewise-deterministic Markov processes obtained with the framework of splitting schemes. We present unadjusted as well as adjusted algorithms, for which the asymptotic bias due to the discretisation error is removed applying a non-reversible Metropolis-Hastings filter. In a general framework we demonstrate that the unadjusted schemes have weak error of second order in the step size, while typically maintaining a computational cost of only one gradient evaluation of the negative log-target function per iteration. Focusing then on unadjusted schemes based on the Bouncy Particle and Zig-Zag samplers, we provide conditions ensuring geometric ergodicity and consider the expansion of the invariant measure in terms of the step size. We analyse the dependence of the leading term in this expansion on the refreshment rate and on the structure of the splitting scheme, giving a guideline on which structure is best. Finally, we illustrate promising results for our samplers with numerical experiments on a Bayesian imaging inverse problem and a system of interacting particles.
Paper Structure (51 sections, 34 theorems, 327 equations, 10 figures, 2 tables, 9 algorithms)

This paper contains 51 sections, 34 theorems, 327 equations, 10 figures, 2 tables, 9 algorithms.

Key Result

Theorem 6

Let $Z_t$ be a PDMP corresponding to the generator eq:genPDMP. Assume that Assumption ass:determisticflowmap to Assumption ass:momentcond hold. Then there exist constants $C,R$ such that for any $g\in \mathcal{C}_\Phi^{2,0}\cap D(\mathcal{L})$ we have for some $M\in\mathbb{N}$

Figures (10)

  • Figure 1: Left: total variation distance to the target up to second order obtained with Proposition \ref{['prop:f_2_1D']}, and right: square root MSE in the estimation of the radius statistic, i.e. $x^2$, for a one-dimensional standard normal target distribution. In both plots $\delta = 0.5$, while for the right plot the number of iterations is $N=2 \cdot 10^5$, the position is initialised at the target distribution and the velocity from the uniform distribution on the unit sphere, and the experiment is repeated $300$ times.
  • Figure 2: Results for the reconstruction of cameraman image using a TV prior. The first row shows the original image $x$, the data $y$ and the posterior mean using a long run of MYMALA. Each panel (d)-(i) shows the posterior mean obtained after $10^6$ iterations with the labelled algorithm.
  • Figure 3: Results for the pixelwise standard deviation for the cameraman image using a TV prior.
  • Figure 4: IMMSE between posterior mean (left) and pixelwise posterior standard deviation (right) estimated by the algorithms ULA, UBU, UZZS, and the continuous ZZS using a long run of MYMALA as a reference for the truth.
  • Figure 5: Numerical simulations in the setting of Section \ref{['subsec:example_particles']}. In all plots the $y$-axis shows the relative mean square error (MSE), that is the MSE normalised by the square of the true value, which in this case corresponds to a long run of the Metropolis-adjusted BPS. The left and right plots show the relative MSE for the estimation respectively of the mean and variance of the test function \ref{['eq:test_particles']}. The initial position of each particle is obtain drawing from the standard Gaussian distribution.
  • ...and 5 more figures

Theorems & Definitions (45)

  • Example 1: Zig-Zag sampler, ZZ
  • Example 2: Bouncy Particle Sampler, BPS
  • Theorem 6
  • Example 3: ZZS continued
  • Theorem 8
  • Theorem 10
  • Theorem 11
  • Remark 12
  • Remark 14
  • Remark 15
  • ...and 35 more