Enhanced SMC$^2$: Leveraging Gradient Information from Differentiable Particle Filters Within Langevin Proposals

Conor Rosato, Joshua Murphy, Alessandro Varsi, Paul Horridge, Simon Maskell

TL;DR

This work extends SMC$^2$ by using first-order gradient information, estimated from a differentiable CRN-PF, to drive Langevin proposals. The approach improves sampling efficiency and parameter recovery in nonlinear, non-Gaussian state-space models. It is demonstrated on a linear Gaussian state-space model (LGSSM) and an SIR disease model, where it gives higher ESS and lower MSE than random-walk proposals. It also scales on distributed-memory hardware, with $\mathcal{O}(\log_2 N)$ resampling complexity and a 51x speed-up on 64 cores relative to a single core. The paper discusses potential future enhancements with HMC/NUTS and GPU-based architectures and provides code for reproduction.

Abstract

Sequential Monte Carlo Squared (SMC$^2$) is a Bayesian method that can infer the states and parameters of non-linear, non-Gaussian state-space models. The standard random-walk proposal in SMC$^2$ faces challenges, particularly with high-dimensional parameter spaces. This study outlines a novel approach that harnesses first-order gradients derived from a Common Random Numbers Particle Filter (CRN-PF) using PyTorch. The resulting gradients can be leveraged within a Langevin proposal without accept/reject. Including Langevin dynamics within the proposal can result in a higher effective sample size and more accurate parameter estimates than the random-walk proposal. The resulting algorithm is parallelized on distributed memory using the Message Passing Interface (MPI) and runs in $\mathcal{O}(\log_2 N)$ time complexity. Using 64 computational cores, we obtain a 51x speed-up compared with a single core. A GitHub link to the code is provided.
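To make the difference between the two proposals concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy Gaussian target whose gradient is known in closed form (`grad_log_gaussian`, a hypothetical stand-in for the gradient that the paper obtains from the CRN-PF via PyTorch), and it uses the standard unadjusted Langevin update $\theta' = \theta + \tfrac{\epsilon^2}{2}\nabla \log \pi(\theta) + \epsilon z$ with $z \sim \mathcal{N}(0, I)$. All function names, the step size, and the parameter values are illustrative assumptions, and any importance-weight correction the sampler needs is not shown.

```python
import random


def grad_log_gaussian(theta, mean, var):
    """Gradient of log N(theta; mean, var * I); a stand-in for the CRN-PF gradient."""
    return [-(t - m) / var for t, m in zip(theta, mean)]


def random_walk_proposal(theta, step, rng):
    """Random-walk move: add isotropic Gaussian noise only."""
    return [t + step * rng.gauss(0.0, 1.0) for t in theta]


def langevin_proposal(theta, grad, step, rng):
    """Langevin move: drift of (step^2 / 2) * gradient plus Gaussian noise."""
    return [t + 0.5 * step**2 * g + step * rng.gauss(0.0, 1.0)
            for t, g in zip(theta, grad)]


theta = [3.0, -2.0]                      # current parameter sample (hypothetical)
mean, var, step = [0.0, 0.0], 1.0, 0.5   # toy target and step size (hypothetical)
grad = grad_log_gaussian(theta, mean, var)

# Seeding both generators identically gives both proposals the same noise,
# so their difference is the gradient drift (up to floating-point rounding).
rw = random_walk_proposal(theta, step, random.Random(0))
lv = langevin_proposal(theta, grad, step, random.Random(0))
drift = [l - r for l, r in zip(lv, rw)]
print(drift)
```

With the same noise draw, the Langevin sample is shifted towards the target mode by `0.5 * step**2 * grad`, which is the mechanism by which gradient information can help where a random walk explores blindly.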

Paper Structure (18 sections, 56 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: (a) Runtime and (b) speed-up plots for the LGSSM when varying the number of computational cores, $P = 1, 2, 4, \dots, 64$.
  • Figure 2: Convergence plots of the LGSSM parameters in two dimensions when using the RW (blue line with star markers) and first-order (red line with circle markers) proposals over $K=15$ iterations, averaged over 5 Monte Carlo runs. The true values are shown by the black solid and dot/dashed lines.
  • Figure 3: Convergence plots of (a) $\beta$ and (b) $\gamma$ of the SIR disease model outlined in \ref{SIRR:s}-\ref{SIRR:r} when using the RW (blue solid line) and first-order (orange dot/dashed line) proposals over $K=15$ iterations, averaged over 5 Monte Carlo runs. The true values are indicated by the horizontal red dashed line. The average MSE of $\beta$ and $\gamma$ at each iteration is plotted in (c).
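
As a worked reading of the speed-up reported in the abstract and plotted in Figure 1(b), assume the usual definitions of speed-up, $S(P) = T_1 / T_P$, and parallel efficiency, $E(P) = S(P) / P$, where $T_P$ is the runtime on $P$ cores. The symbols $T_P$, $S$, and $E$ are notation introduced here, not taken from the paper. The reported 51x speed-up on 64 cores then corresponds to:

$$E(64) = \frac{S(64)}{64} = \frac{51}{64} \approx 0.80$$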