Auxiliary MCMC and particle Gibbs samplers for parallelisable inference in latent dynamical systems

Adrien Corenflos, Simo Särkkä

TL;DR

The paper tackles scalable Bayesian inference for high‑dimensional latent dynamical systems by introducing auxiliary MCMC methods that augment the target with artificial observations. It develops two complementary routes: auxiliary Kalman samplers that exploit LGSSM structure (and extend to non‑Gaussian dynamics via local Gaussianisations) and auxiliary particle Gibbs samplers that use auxiliary observations to create locally informed proposals, including gradient‑informed and hybrid variants. These approaches enable linear‑time, memory‑efficient sampling for LGSSMs and parallel‑in‑time sampling via prefix‑sum or divide‑and‑conquer strategies, with theoretical and empirical gains in statistical efficiency and GPU scalability. The methods are demonstrated on multivariate stochastic volatility, a high‑dimensional spatio‑temporal model, and a continuous‑discrete diffusion smoothing problem, showing improved mixing (via ESJD/ESS) and favorable runtime performance relative to established baselines. This work advances practical Bayesian inference for complex latent dynamical systems by blending Gaussian approximation, local linearisation, and particle MCMC within a unified, parallelisable framework.

Abstract

Sampling from the full posterior distribution of high-dimensional non-linear, non-Gaussian latent dynamical models presents significant computational challenges. While Particle Gibbs (also known as conditional sequential Monte Carlo) is considered the gold standard for this task, it quickly degrades in performance as the latent space dimensionality increases. Conversely, globally Gaussian-approximated methods like extended Kalman filtering, though more robust, are seldom used for posterior sampling due to their inherent bias. We introduce novel auxiliary sampling approaches that address these limitations. By incorporating artificial observations of the system as auxiliary variables in our MCMC kernels, we develop both efficient exact Kalman-based samplers and enhanced Particle Gibbs algorithms that maintain performance in high-dimensional latent spaces. Some of our methods support parallelisation along the time dimension, achieving logarithmic scaling when implemented on GPUs. Empirical evaluations demonstrate superior statistical and computational performance compared to existing approaches for high-dimensional latent dynamical systems.

Paper Structure

This paper contains 38 sections, 2 theorems, 86 equations, 6 figures, 1 table, 11 algorithms.

Key Result

Proposition 3.1

The method of finke2021csmc, given in Algorithm alg:local-csmc, implements Algorithm alg:aux-pgibbs with $u^{k+1}_t \sim \mathcal{N}\left(u_t; x_t^{k}, \frac{\delta_t}{2} I\right)$ and proposal distributions $\tilde{p}(x_{t} \mid x_{t-1}) \sim \mathcal{N}\left(\cdot; u^{k+1}_t, \frac{\delta_t}{2} I\right)$ ...
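
The two Gaussian draws named in Proposition 3.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it covers only the auxiliary draw $u_t \sim \mathcal{N}(x_t, \frac{\delta_t}{2} I)$ and a proposal draw centred on $u_t$ with the same covariance, and it omits the weighting, resampling, and trajectory selection of the full conditional SMC algorithm. The function name `gaussian_around` and the list-of-lists layout for trajectories are hypothetical choices made for this example.

```python
import math
import random


def gaussian_around(centers, delta, rng):
    """For each time step t, draw a vector from N(centers[t], (delta[t] / 2) I).

    `centers` is a list of T state vectors (lists of floats) and `delta` is a
    list of T non-negative step sizes. Both names are illustrative only.
    """
    return [
        [c + math.sqrt(d / 2.0) * rng.gauss(0.0, 1.0) for c in center_t]
        for center_t, d in zip(centers, delta)
    ]


rng = random.Random(0)
x_current = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]  # current trajectory x^k, T = 3
delta = [0.5, 0.5, 0.5]                           # per-time-step step sizes

# Auxiliary variables: u_t ~ N(x_t^k, (delta_t / 2) I).
u = gaussian_around(x_current, delta, rng)

# One proposal per time step, centred on the auxiliary variable:
# x_t ~ N(u_t, (delta_t / 2) I).
x_proposed = gaussian_around(u, delta, rng)
```

Composing the two draws gives a proposal whose marginal distribution around the current state is $\mathcal{N}(x_t^{k}, \delta_t I)$, since the two independent Gaussian perturbations each contribute variance $\delta_t / 2$.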

Figures (6)

  • Figure 1: Average (across 10 different experiments) expected squared jump distance per iteration for all the samplers considered on the stochastic volatility model of Section \ref{subsec:stoch-vol}.
  • Figure 2: Average (across 10 different experiments) expected squared jump distance per second for all the samplers considered on the stochastic volatility model.
  • Figure 3: Average (across 20 different experiments) expected squared jump distance per iteration and per second for all the samplers considered on the spatio-temporal model (\ref{eq:spatio-temporal}).
  • Figure 4: Heatmap of the average squared error of the estimated mean of the first time step $x_0$, scaled by the true standard deviation, for the different samplers as a function of the observation noise $r^2$ and the autocorrelation of the latent dynamics $\rho$. Here "cSMC" stands for finke2021csmc, with (g) indicating gradient information (Section \ref{subsubsec:diff}), and (p) indicating parallelisation in time of corenflos2022sequentialized. "Kalman" stands for the first-order auxiliary Kalman samplers of Section \ref{sec:auxiliary_samplers}, and "Guided cSMC" stands for the methods of Sections \ref{subsubsec:guided} and \ref{subsubsec:guided-diff} (with (g) indicating gradient information for the latter).
  • Figure 5: Illustration of the Hillis--Steele prefix-sum algorithm. The algorithm performs $\lfloor{\log_2 T}\rfloor$ iterations, each of which uses only embarrassingly parallel operations.
  • ...and 1 more figure
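
The Hillis--Steele prefix-sum algorithm named in the Figure 5 caption can be sketched as below. This is a sequential Python emulation written for illustration, not the paper's GPU implementation: each pass of the `while` loop stands for one parallel sweep, in which every position reads only from the previous sweep's array. The function name `hillis_steele_scan` is a hypothetical choice, and the operator is assumed to be associative (it need not be commutative).

```python
from operator import add


def hillis_steele_scan(values, op=add):
    """Inclusive prefix scan of `values` under an associative operator `op`."""
    out = list(values)
    stride = 1
    while stride < len(out):
        # One sweep: position i >= stride combines the element `stride` places
        # to its left with its own. The new list is built entirely from the old
        # one, so on parallel hardware all positions could update at once.
        out = [
            out[i] if i < stride else op(out[i - stride], out[i])
            for i in range(len(out))
        ]
        stride *= 2
    return out


print(hillis_steele_scan([1, 2, 3, 4, 5, 6, 7, 8]))
# prints [1, 3, 6, 10, 15, 21, 28, 36]
```

For the length-8 input above the loop runs three sweeps (strides 1, 2, and 4), which matches $\log_2 8 = 3$. Replacing `op` with another associative combination, such as the composition of filtering or smoothing elements, gives the general pattern that parallel-in-time methods rely on.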

Theorems & Definitions (8)

  • Remark 2.1
  • Remark 2.2
  • Example 2.1
  • Remark 3.1
  • Remark 3.2
  • Proposition 3.1
  • Proposition A.1
  • Proof