Debiasing Piecewise Deterministic Markov Process samplers using couplings

Adrien Corenflos, Matthew Sutton, Nicolas Chopin

TL;DR

This work extends the coupled MCMC estimators of Jacob et al. (2020) to the continuous-time context and derives couplings for the bouncy, the boomerang, and the coordinate samplers.

Abstract

Monte Carlo methods -- such as Markov chain Monte Carlo (MCMC) and piecewise deterministic Markov process (PDMP) samplers -- provide asymptotically exact estimators of expectations under a target distribution. There is growing interest in alternatives to this asymptotic regime, in particular in constructing estimators that are exact in the limit of an infinite number of processors rather than in the limit of an infinite number of Markov iterations. Notably, Jacob et al. (2020) introduced coupled MCMC estimators that remove the non-asymptotic bias, resulting in MCMC estimators that can be embarrassingly parallelised. In this work, we extend the estimators of Jacob et al. (2020) to the continuous-time context and derive couplings for the bouncy, the boomerang and the coordinate samplers. Preliminary empirical results suggest that the method scales reasonably with the dimension of the target.
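
To make the debiasing idea concrete, the following is a minimal discrete-time sketch of a lag-1 coupled-chain estimator in the style of Jacob et al. (2020), $H_k = h(X_k) + \sum_{t=k+1}^{\tau-1} \{h(X_t) - h(Y_{t-1})\}$, where $\tau$ is the first time $X_t = Y_{t-1}$. It is not the continuous-time PDMP construction of this paper. The three-state transition matrix `P`, the common-random-numbers coupling in `step`, and the function names are all assumptions chosen for illustration.

```python
import random

# Hypothetical 3-state transition matrix (each row sums to 1); not from the paper.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]


def step(state, u):
    """Inverse-CDF transition from `state` driven by a uniform draw u in [0, 1)."""
    acc = 0.0
    for nxt, p in enumerate(P[state]):
        acc += p
        if u < acc:
            return nxt
    return len(P) - 1  # guard against floating-point round-off


def unbiased_estimate(h, rng, k=0):
    """One draw of H_k using two chains coupled through common random numbers."""
    x = rng.randrange(len(P))  # X_0
    y = rng.randrange(len(P))  # Y_0, independent of X_0, same initial law
    total = h(x) if k == 0 else 0.0
    x = step(x, rng.random())  # advance X once: the pair is now (X_1, Y_0)
    t = 1
    met = x == y
    while not met or t <= k:
        if t == k:
            total += h(x)          # the usual MCMC term h(X_k)
        elif t > k and not met:
            total += h(x) - h(y)   # bias-correction term h(X_t) - h(Y_{t-1})
        u = rng.random()           # the same uniform moves both chains,
        x, y = step(x, u), step(y, u)  # so they stay together once they meet
        t += 1
        met = met or x == y
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    n = 10_000
    # Independent replicates can be averaged (and computed in parallel).
    mean = sum(unbiased_estimate(float, rng) for _ in range(n)) / n
    print(mean)
```

Each call returns an independent replicate, so the average over replicates can be compared with the stationary mean of this toy chain, which is $13/14$ for $h(s) = s$ under the assumed `P`.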

Paper Structure

This paper contains 39 sections, 9 theorems, 47 equations, 4 figures, 1 table, 10 algorithms.

Key Result

Proposition 2.1

Assumption \ref{ass:integrability-v2} implies Assumption \ref{ass:integrability-v1}.

Figures (4)

  • Figure 1: Gaussian distribution scaling example. Average $\Delta$-coupling times of 500 coupled PDMPs for BPS (left), the Coordinate sampler (middle) and the Boomerang sampler (right). The top row shows the average meeting time $\kappa$ as defined in Definition \ref{def:async-coupling}; the bottom row shows the average computational cost.
  • Figure 2: Logistic regression example: tuning parameters. Inefficiency is measured as the average number of gradient evaluations multiplied by the sum of the variances of the estimators, for BPS (left) and Boomerang (right).
  • Figure 3: Scatter plot of the coupling for $\Xi' + \mu - \Xi$ and $\Xi$ for different choices of $\beta$ in Theorem \ref{thm:coupling-rep}. The three different choices of residual coupling have the same behaviour "when the marginals are coupled", i.e., right of $0.5$ on the x-axis, but result in markedly different behaviours otherwise.
  • Figure 4: Tuning $\Delta$ for coupling efficiency in the Gaussian distribution scaling example. Plots show the average meeting time $\kappa$ for coupled BPS, coupled Boomerang and the coupled Coordinate Sampler when sampling $N(0_8, I_8)$.

Theorems & Definitions (25)

  • Definition 2.1: $\Delta$-coupling
  • Remark 2.1: Suboptimality of Assumption \ref{ass:coupling-tails}
  • Proposition 2.1
  • proof
  • Definition 2.2: Discretised Rhee & Glynn, DRG
  • Proposition 2.2
  • proof
  • Definition 2.3: Averaged Discretised Rhee & Glynn, ADRG
  • Corollary 2.1
  • Remark 2.2
  • ...and 15 more