Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks

Benjamin Cox, Santiago Segarra, Victor Elvira

TL;DR

StateMixNN addresses state estimation in nonlinear state-space models (NLSSMs) with unknown transition dynamics by jointly learning the transition distribution $f$ and the proposal distribution $\pi$ of a particle filter. Each distribution is approximated by a Gaussian mixture whose component means and covariances are outputs of a neural network. Training maximises the log-likelihood $\ell(\bm\theta|{\mathbf y}_{1:T})$ using only the observations, and relies on differentiable particle filtering with the stop-gradient technique to backpropagate through resampling. The method adopts diagonal covariances and equal mixture weights for computational efficiency, and uses an alternating training scheme to stabilise learning of the two networks. Empirical results on Lorenz 96 and Kuramoto NLSSMs show that StateMixNN outperforms the bootstrap particle filter (BPF) and improved-proposal methods, particularly as nonlinearity, dimensionality, and mixture richness increase, which indicates practical applicability to systems with unknown dynamics.
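To make the mixture parameterisation concrete, the following is a minimal sketch of the log-density of a Gaussian mixture with equal weights $1/K$ and diagonal covariances, as described above. It is not the authors' implementation: the function names are hypothetical, and the `means` and `variances` lists are hand-written stand-ins for what the neural networks would output for a given previous state.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def diag_gaussian_logpdf(x, mean, var):
    """Log-density of N(mean, diag(var)) at x; all arguments are equal-length lists."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mi) ** 2 / v)
        for xi, mi, v in zip(x, mean, var)
    )

def equal_weight_mixture_logpdf(x, means, variances):
    """Log-density of a K-component mixture with weights 1/K and diagonal covariances."""
    k = len(means)
    component_logs = [diag_gaussian_logpdf(x, m, v) for m, v in zip(means, variances)]
    return log_sum_exp(component_logs) - math.log(k)

# Hypothetical stand-ins for network outputs: K = 2 components in dimension 2.
means = [[0.0, 0.0], [1.0, -1.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
print(equal_weight_mixture_logpdf([0.2, -0.1], means, variances))
```

With equal weights and diagonal covariances, each component contributes only a sum of scalar Gaussian terms, so the cost of one density evaluation grows linearly in both the number of components and the state dimension.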

Abstract

State-space models are a popular statistical framework for analysing sequential data. Within this framework, particle filters are often used to perform inference on non-linear state-space models. We introduce a new method, StateMixNN, that uses a pair of neural networks to learn the proposal distribution and transition distribution of a particle filter. Both distributions are approximated using multivariate Gaussian mixtures. The component means and covariances of these mixtures are the outputs of learned functions. Our method is trained targeting the log-likelihood, thereby requiring only the observation series, and combines the interpretability of state-space models with the flexibility and approximation power of artificial neural networks. The proposed method significantly improves recovery of the hidden state in comparison with the state-of-the-art, showing greater improvement in highly non-linear scenarios.
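As a rough illustration of why targeting the log-likelihood requires only the observation series, the sketch below estimates the log-likelihood of a candidate transition function with a plain bootstrap particle filter. This is not StateMixNN itself: it computes no gradients and learns nothing, and the scalar toy model, the noise variances `q` and `r`, and all function names are assumptions made for the example.

```python
import math
import random

def gaussian_logpdf(x, mean, var):
    """Log-density of a scalar Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def simulate(transition_mean, n_steps, q=0.1, r=0.1, seed=1):
    """Draw observations from the toy model; the hidden states are discarded."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n_steps):
        x = transition_mean(x) + rng.gauss(0.0, math.sqrt(q))
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))
    return ys

def bootstrap_pf_loglik(ys, transition_mean, n_particles=200, q=0.1, r=0.1, seed=0):
    """Bootstrap particle filter estimate of log p(y_{1:T}) for a candidate transition."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for y in ys:
        # Propagate each particle through the candidate transition (the proposal).
        particles = [transition_mean(x) + rng.gauss(0.0, math.sqrt(q)) for x in particles]
        # Weight by the observation density; subtract the maximum for stability.
        log_w = [gaussian_logpdf(y, x, r) for x in particles]
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        # The mean unnormalised weight estimates p(y_t | y_{1:t-1}).
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling at every step.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik

def true_f(x):
    return 0.9 * x

def wrong_f(x):
    return x + 5.0

ys = simulate(true_f, n_steps=30)
print(bootstrap_pf_loglik(ys, true_f), bootstrap_pf_loglik(ys, wrong_f))
```

The hidden states never enter the estimate, so a transition that explains the observations poorly is penalised through the likelihood alone. A learning method would additionally need gradients of this quantity with respect to the transition and proposal parameters, which is where differentiable resampling comes in.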

Paper Structure

This paper contains 19 sections, 18 equations, 11 figures, 5 algorithms.

Figures (11)

  • Figure 1: Frame diagram of the outer algorithm, $\mathrm{StateMixNN}(B, J, A, \mathbf y, \mathrm{NN}^{(f)}, \mathrm{NN}^{(\pi)})$.
  • Figure 2: Frame diagram of the training algorithm, $\mathrm{ConditionalUpdate}(B, J, \bm\theta_0, \bm\theta_\mathrm{static}, \mathbf y)$.
  • Figure 3: Frame diagram of the update step, $\mathrm{UpdateStep}(\bm\theta_\mathrm{learn}, \bm\theta_\mathrm{static}, \mathbf y)$.
  • Figure 4: Comparison of StateMixNN with the BPF, IAPF, and PropMixNN over variable numbers of particles. The lines denote mean performance, with bands denoting symmetric $95\%$ intervals.
  • Figure 5: Comparison of StateMixNN with the BPF, IAPF, and PropMixNN over variable series length. The lines denote mean performance, with bands denoting symmetric $95\%$ intervals.
  • ...and 6 more figures