Diffusive Scaling Limits of Forward Event-Chain Monte Carlo: Provably Efficient Exploration with Partial Refreshment

Hirofumi Shiba, Kengo Kamatani

TL;DR

We develop a high-dimensional scaling analysis for standard Gaussian targets and prove that the negative log-density process of FECMC converges to an Ornstein--Uhlenbeck diffusion under the same scaling as BPS.

Abstract

Piecewise deterministic Markov process samplers are attractive alternatives to Metropolis--Hastings algorithms. A central design question is how to incorporate partial velocity refreshment to ensure ergodicity without injecting excessive noise. Forward Event-Chain Monte Carlo (FECMC) is a generalization of the Bouncy Particle Sampler (BPS) that addresses this issue through a stochastic reflection mechanism, thereby reducing reliance on global refreshment moves. Despite promising empirical performance, its theoretical efficiency remains largely unexplored. We develop a high-dimensional scaling analysis for standard Gaussian targets and prove that the negative log-density (or potential) process of FECMC converges to an Ornstein--Uhlenbeck diffusion, under the same scaling as BPS. We derive closed-form expressions for the limiting diffusion coefficients of both methods by analyzing their associated radial momentum processes and solving the corresponding Poisson equations. These expressions yield a sharp efficiency comparison: the diffusion coefficient of FECMC is strictly larger than that of optimally tuned BPS, and the optimum for FECMC is attained at zero global refreshment. Specifically, they imply an approximately eightfold increase in effective sample size per event over optimal BPS. Numerical experiments confirm the predicted diffusion coefficients and show that the resulting efficiency gains remain substantial for a range of non-Gaussian targets. Finally, as an application of these results, we propose an asymptotic variance estimator for piecewise deterministic Markov processes that becomes increasingly efficient in high dimensions by extracting information from the velocity variable.
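To make the objects in the abstract concrete, the following is a minimal sketch of the baseline BPS (not FECMC, whose stochastic reflection rule is not specified in this summary) on a standard Gaussian target, recording the potential process $U(X_t)=\lvert X_t\rvert^2/2$ at event times. The function name `bps_potential_path`, the unit-sphere velocity normalization, and the return format are assumptions made for illustration, not taken from the paper. For this target the bounce rate along a straight segment is $\max(0, a+bs)$ with $a=\langle x,v\rangle$ and $b=\lvert v\rvert^2$, so the bounce time can be drawn exactly by inverting the integrated rate.

```python
import numpy as np


def bps_potential_path(d, rho, t_end, rng):
    """Sketch of BPS on N(0, I_d); returns event times and potentials U = |x|^2 / 2.

    Assumptions (illustrative, not from the paper): velocities live on the
    unit sphere, and global refreshment at rate rho redraws v uniformly.
    """
    def unit_vector():
        z = rng.standard_normal(d)
        return z / np.linalg.norm(z)

    x = rng.standard_normal(d)  # start at stationarity
    v = unit_vector()
    t = 0.0
    times, potentials = [t], [0.5 * (x @ x)]
    n_bounce = n_refresh = 0

    while True:
        a, b = x @ v, v @ v
        # Exact bounce time: solve int_0^tau max(0, a + b s) ds = E, E ~ Exp(1).
        e = rng.exponential()
        tau_bounce = (-a + np.sqrt(max(a, 0.0) ** 2 + 2.0 * b * e)) / b
        tau_refresh = rng.exponential(1.0 / rho) if rho > 0 else np.inf
        tau = min(tau_bounce, tau_refresh)

        if t + tau >= t_end:
            # Horizon reached before the next event: move and stop.
            x = x + (t_end - t) * v
            times.append(t_end)
            potentials.append(0.5 * (x @ x))
            break

        x = x + tau * v
        t += tau
        if tau_bounce <= tau_refresh:
            # Deterministic reflection against grad U(x) = x.
            v = v - 2.0 * (x @ v) / (x @ x) * x
            n_bounce += 1
        else:
            v = unit_vector()
            n_refresh += 1
        times.append(t)
        potentials.append(0.5 * (x @ x))

    return np.array(times), np.array(potentials), v, (n_bounce, n_refresh)
```

Calling `bps_potential_path(d=100, rho=1.0, t_end=100.0, rng=np.random.default_rng(0))` gives a single path of the potential; comparing such paths across values of `rho` is one way to explore the dependence of the fluctuations on the refreshment rate that the abstract describes, though this sketch applies none of the paper's scaling or centering.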

Paper Structure (39 sections, 25 theorems, 145 equations, 5 figures)

Key Result

Proposition 2.1

Assume that the potential $U(x)$ is continuously differentiable and $C_c^1(\mathbb{R}^d)$ is a core of the generator $L$ given in eq-PDMP-generator. If for every $C^1$-function $f$ with compact support, where $(x)_-\coloneq \min(x,0)$, then the product distribution $\pi\otimes\mu$ is the invariant distribution of the PDMP corresponding to the generator $L$.

Figures (5)

  • Figure 1: Diffusion coefficient $\sigma$ of the limiting negative log-density (potential) process vs. the global refreshment rate $\rho$ (FECMC vs. BPS). Standard Gaussian target with spherical velocity. Curves computed from the analytic expressions in Theorem \ref{thm-formula-for-sigma}.
  • Figure 2: Estimated ESS (left) and ESS per CPU second (right), together with 95% BCa bootstrap confidence intervals, against dimensionality. The parenthesized values below the $x$-ticks represent the ESS mean ratio of FECMC to BPS at each dimension. The estimator is given by $\widehat{\mathrm{ESS}}_T=1/\widehat{\mathrm{MSE}}_T$ for $T=100$. Both plots are based on 1000 independent runs of BPS and FECMC, targeting the standard Gaussian distribution $\pi_1(x)\propto\exp(-\lvert x\rvert^2/2)$. The two black lines in the left plot represent the theoretical limiting values in Eq. \ref{eq-ESS} as $d,T\to\infty$ for FECMC and BPS, respectively.
  • Figure 3: Estimated ESS against dimensionality, together with 95% BCa bootstrap confidence intervals. The target distribution is either the i.i.d. logistic distribution $\pi_2(x)=\prod_{i=1}^d\frac{e^{x_i}}{(1+e^{x_i})^2}$ (left plot) or the anisotropic Gaussian distribution $\pi_3\propto\exp(-x^\top\Sigma^{-1}x/2)$, where $\Sigma_{ii}=1$ and $\Sigma_{ij}=0.5$ for $i\ne j$ (right plot). In both cases, each algorithm is run 1000 times independently with the time horizon $T=100$.
  • Figure 4: Estimated ESS against deviation parameters $\gamma,\nu^{-1}$, which quantify departures from isotropy and Gaussianity, together with 95% BCa bootstrap confidence intervals. The target distribution is either the anisotropic Gaussian $\pi_3(x)\propto\exp(-x^\top\Sigma^{-1}x/2)$ with varying correlations $\Sigma_{ij}=\gamma\in\{0,0.1,0.2,\ldots,0.9\}$ (left plot) or the spherically symmetric Student distribution $\pi_4(x)\propto(1+\lvert x\rvert^2/\nu)^{-(d+\nu)/2}$ with varying degrees of freedom $\nu\in\{10,10^2,10^3,10^4\}$ (right plot). The black lines denote theoretical limiting values derived under the standard Gaussian assumption in Eq. \ref{eq-ESS}. In both cases, the dimension is $d=100$ and each algorithm is run 1000 times independently with the time horizon $T=100$.
  • Figure 5: Boxplots of two estimators, $\widehat{\varsigma^2}_{\textrm{slow}}$ and $\widehat{\varsigma^2}_{\textrm{fast}}$, over 100 runs against increasing dimension $d$. The target is either the standard Gaussian distribution $\pi_1(x)\propto\exp(-\lvert x\rvert^2/2)$ (left plot) or the anisotropic Gaussian distribution $\pi_3(x)\propto\exp(-x^\top\Sigma^{-1}x/2)$ with $\Sigma_{ii}=1$ and $\Sigma_{ij}=0.5$ for $i\ne j$ (right plot). The black lines in both plots represent (a proxy for) the true values. The subplot at the bottom shows the mean squared error (MSE) of each estimator.

Theorems & Definitions (53)

  • Proposition 2.1
  • Proposition 3.2: Limit of the Radial Momentum Process
  • Corollary 3.3: Number of Velocity Jumps
  • Remark 3.4: Asymptotic Ratio of Mean Jump Frequencies
  • Remark 3.5: Asymptotic Ratio of ESS per Event
  • Proposition 3.6: Exponential Ergodicity of $R^F$
  • Theorem 3.7: Scaling Limit of the Potential Process when $\rho=0$
  • Remark 3.8: Comparison with BPS
  • Theorem 3.9: Scaling Limit of the Potential Process when $\rho>0$
  • Remark 3.10: Continuity at $\rho=0$
  • ...and 43 more