Optimised Annealed Sequential Monte Carlo Samplers

Saifuddin Syed, Alexandre Bouchard-Côté, Kevin Chern, Arnaud Doucet

TL;DR

The paper develops Optimised Annealed Sequential Monte Carlo (OASMC) and Optimised AIS (OAIS) by embedding ASMC into a dense annealing schedule and modeling the variance of the normalising-constant estimator through local/global barriers. It shows that the total discrepancy along the annealing path governs asymptotic efficiency, and that the optimal schedule corresponds to a geodesic on the annealing manifold, minimizing kinetic energy. It provides a round-based, deterministic framework with unbiased $Z$-estimation, introduces memory-efficient AIS, and demonstrates substantial GPU speedups alongside an open-source GPU implementation. The work offers practical guidelines for tuning annealing schedules, resampling strategies, and kernel choices, supported by theoretical guarantees and extensive numerical experiments across diverse models. Overall, the methods enable predictable runtimes, improved scalability, and strong performance on modern hardware for Bayesian inference tasks that rely on normalising constant estimation.
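The schedule-tuning idea can be illustrated with a small sketch. It assumes, as the Figure 3 caption suggests, that a well-tuned schedule carries roughly the same discrepancy $\Lambda^2/T^2$ in every step, so each step takes an equal share $\Lambda/T$ of the cumulative barrier. The function name `equalise_schedule`, its inputs, and the use of linear interpolation (rather than whatever interpolation the paper's schedule update uses) are assumptions made for illustration, not the paper's implementation.

```python
from bisect import bisect_left


def equalise_schedule(betas, step_barriers, num_steps):
    """Re-space a schedule so each step has an equal share of the barrier.

    betas: increasing grid beta_0 < ... < beta_T (length T + 1).
    step_barriers: hypothetical estimates of sqrt(D(beta_{t-1}, beta_t)),
        one non-negative value per step (length T).
    num_steps: number of steps in the new schedule.
    Returns (new_betas, total_barrier).
    """
    if len(betas) != len(step_barriers) + 1:
        raise ValueError("need one barrier estimate per schedule step")

    # Cumulative barrier at each grid point; the last entry estimates Lambda.
    cumulative = [0.0]
    for barrier in step_barriers:
        cumulative.append(cumulative[-1] + barrier)
    total = cumulative[-1]
    if total <= 0.0:
        raise ValueError("total barrier must be positive")

    new_betas = [betas[0]]
    for t in range(1, num_steps):
        target = total * t / num_steps
        # Find the grid interval containing the target cumulative barrier.
        i = min(max(bisect_left(cumulative, target), 1), len(cumulative) - 1)
        lo, hi = cumulative[i - 1], cumulative[i]
        frac = 0.0 if hi == lo else (target - lo) / (hi - lo)
        # Linear interpolation (an assumption of this sketch).
        new_betas.append(betas[i - 1] + frac * (betas[i] - betas[i - 1]))
    new_betas.append(betas[-1])
    return new_betas, total
```

For example, `equalise_schedule([0.0, 0.5, 1.0], [3.0, 1.0], 4)` places three of the four new steps in the first half of the path, where most of the barrier sits in this made-up input.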

Abstract

Annealed Sequential Monte Carlo (ASMC) samplers are special cases of SMC samplers where the sequence of distributions can be embedded in a smooth path of distributions. Using this underlying path and a performance model based on the variance of the normalising constant estimator, we systematically study dense-schedule limits. From our theory emerges a notion of global barrier, capturing the inherent complexity of normalising constant approximation under our performance model. We then turn the resulting approximations into surrogate objective functions of algorithm performance, using them to guide method development. This leads to novel adaptive methods, Optimised Annealed SMC (OASMC), which address practical difficulties inherent in previous adaptive SMC methods. First, our OASMC algorithms are predictable: they produce a sequence of increasingly precise estimates at deterministic, known times. Second, Optimised Annealed Importance Sampling (OAIS), a special case of OASMC, enables schedule adaptation at a memory cost constant in the number of particles, requiring significantly less communication. Finally, these characteristics make OAIS highly efficient on GPUs. We provide an open-source, high-performance GPU implementation of our method and demonstrate up to a hundred-fold speed improvement compared to state-of-the-art adaptive AIS methods.
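As a concrete reference point, here is a minimal sketch of plain (non-adaptive) annealed importance sampling estimating a log normalising constant. It is not the paper's OAIS algorithm: the one-dimensional Gaussian reference and target, the geometric path, the random-walk Metropolis kernel, and all function names are assumptions chosen so that the true answer, $\log(\sigma\sqrt{2\pi})$, is known.

```python
import math
import random


def log_prior(x):
    # Normalised reference N(0, 1), so Z_0 = 1.
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def log_target_unnorm(x, mu=1.0, sigma=0.5):
    # Unnormalised Gaussian target; its true Z is sigma * sqrt(2 * pi).
    return -0.5 * ((x - mu) / sigma) ** 2


def log_annealed(x, beta):
    # Geometric path between the reference (beta = 0) and target (beta = 1).
    return (1.0 - beta) * log_prior(x) + beta * log_target_unnorm(x)


def ais_log_z(num_particles, betas, rng, step=1.0):
    """Return the log of the AIS estimate of Z along the schedule `betas`."""
    log_weights = []
    for _ in range(num_particles):
        x = rng.gauss(0.0, 1.0)  # exact draw from the reference
        log_w = 0.0
        for b_prev, b in zip(betas[:-1], betas[1:]):
            # Incremental weight gamma_b(x) / gamma_{b_prev}(x), in logs.
            log_w += (b - b_prev) * (log_target_unnorm(x) - log_prior(x))
            # One random-walk Metropolis step leaving pi_b invariant.
            y = x + rng.gauss(0.0, step)
            diff = log_annealed(y, b) - log_annealed(x, b)
            if rng.random() < math.exp(min(0.0, diff)):
                x = y
        log_weights.append(log_w)
    # Log of the average weight, computed stably (log-mean-exp).
    m = max(log_weights)
    mean = sum(math.exp(lw - m) for lw in log_weights) / num_particles
    return m + math.log(mean)
```

Each particle carries only its current state and one running log-weight, and particles never interact, which is in the spirit of the abstract's remark about AIS-style methods needing little memory and communication.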

Paper Structure (131 sections, 2 theorems, 205 equations, 14 figures, 3 algorithms)

Key Result

Corollary 1

Suppose Assumptions \ref{assump:regular-proposal}, \ref{assump:regular-weights}, \ref{assump:non-degeneracy}, and \ref{assump:schedule_generator} hold. Then:

Figures (14)

  • Figure 1: Left: Particle evolution for AIS (top), DAR (middle), and SIR (bottom). Middle: Phase diagram identifying particle stability regimes when $T\sim \Lambda^{\alpha_T}$ and $\rho=\Theta(\Lambda^{\alpha_\rho})$. The unstable regime (red) requires exponentially many particles, the stable regime (green) requires a sublinear number, and the strongly stable regime (yellow) requires only a constant number. We also plot the stability regimes for AIS (dashed), DAR (solid), and SIR (dotted) as the schedule density $\alpha_T$ increases. Right: Discretised annealing paths between two Gaussian distributions at increasing schedule densities: $\alpha_T\in (0,1)$ (bottom), $\alpha_T\in (1,2)$ (middle), and $\alpha_T\in (2,\infty)$ (top).
  • Figure 2: Relative ESS $N_\text{eff}(t)/N$ as a function of iteration for OASMC targeting the Ising model. Facet columns denote the adaptation round $k$ in Algorithm \ref{alg:OASMC}. The top facet row shows OAIS and the bottom facet row shows OASMC. For OASMC, adaptive resampling (AR; Section \ref{sec:resampling}) is triggered when $N_\text{eff}(t)/N < \nu=1/2$ (dotted red line). As predicted by the theoretical results for DAR in Section \ref{sec:high-barrier-scaling-resampling}, in the unstable schedule regime (small $k$) resampling occurs (red square) at every iteration; as $k$ increases, resampling becomes periodic and less frequent; in the strongly stable schedule regime (large $k$) no resampling occurs, so OAIS and OASMC are equivalent.
  • Figure 3: Top row: Evolution of the schedule generators, plotting the annealing parameter $\beta_t$ as a function of the normalised schedule step $u_t=t/T$ over 6 rounds of OASMC (facet columns) with five random seeds (colours) on an ODE Bayesian parameter estimation problem for mRNA transfection data \cite{leonhardt_single-cell_2014}. Bottom row: the method of \cite{Zhou2016} (labelled ZJA). Each round here corresponds to a given budget, and we compare that round's schedule with the one produced by \cite{Zhou2016} under the same budget. To achieve this, guided by our theoretical results, we set the line search threshold in \cite{Zhou2016} using $D(\beta_{t-1}^*, \beta_t^*) \approx \Lambda^2/T^2$, where $\Lambda \approx 16$ is estimated from a separate, large run (the timing of that run is not included, for fairness of comparison).
  • Figure 4: A flowchart recommending an annealing algorithm based on goals and various information on the computing architecture. See Sections \ref{sec:choosing} and \ref{sec:expectations}.
  • Figure 5: Estimates of the local barrier $\lambda$ for six inference problems (facets) obtained with OASMC for 20 rounds. Each curve is obtained by differentiating the cubic spline computed by Algorithm \ref{alg:schedule-update} in the last round. The different curves correspond to increasing expected numbers of updates per component of the target during the propagation step of Algorithm \ref{alg:AMCS}. For example, the curve labelled $0.5$ means that only a random half of the variables are updated by $M_{\beta, \beta'}$.
  • ...and 9 more figures
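
The resampling rule quoted in the Figure 2 caption (resample when $N_\text{eff}(t)/N < \nu = 1/2$) can be sketched as follows. The sketch assumes the standard effective sample size $N_\text{eff} = (\sum_i w_i)^2 / \sum_i w_i^2$; the captions do not spell out the paper's definition, and the function names are hypothetical.

```python
import math


def relative_ess(log_weights):
    """Relative effective sample size N_eff / N from unnormalised log-weights.

    Assumes the standard definition N_eff = (sum w)^2 / sum(w^2).
    """
    n = len(log_weights)
    # Subtract the maximum before exponentiating for numerical stability;
    # the ratio below is invariant to this common rescaling.
    m = max(log_weights)
    weights = [math.exp(lw - m) for lw in log_weights]
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / (total_sq * n)


def should_resample(log_weights, nu=0.5):
    """Trigger resampling when the relative ESS drops below the threshold nu."""
    return relative_ess(log_weights) < nu
```

With equal weights the relative ESS is 1 and no resampling is triggered; when a single particle dominates, it approaches $1/N$ and the rule fires.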

Theorems & Definitions (17)

  • Corollary 1
  • Theorem 1
  • proof
  • proof: Proof of \ref{lem:adaptive_SMCS_2}
  • proof: Proof of \ref{prop:adaptive_SMCS}
  • proof: Proof of \ref{lem:product-rule}
  • proof: Proof of \ref{lem:regularity-measures}
  • proof: Proof of \ref{lem:regularity}
  • proof: Proof of \ref{thm:incremental_discrepancy_estimate}
  • proof
  • ...and 7 more