Efficient Online Variational Estimation via Monte Carlo Sampling

Mathis Chagneux, Mathias Müller, Pierre Gloaguen, Sylvain Le Corff, Jimmy Olsson

TL;DR

The paper tackles online parameter and latent-state inference in parametric state-space models. It introduces Recursive Monte Carlo Variational Inference (RMCVI), a backward-factorized variational framework whose learning objective, COLBO, is justified via an extended Markov-chain ergodicity argument and stochastic approximation. The authors derive recursions for the ELBO and its gradient and provide a practical Monte Carlo estimator with backward sampling and variance reduction. Empirical results on linear-Gaussian HMMs, chaotic RNNs, and air-quality data show competitive accuracy and substantial efficiency gains over online SMC and regression-based VI baselines, highlighting the method’s scalability for streaming-data applications.

Abstract

This article addresses online variational estimation in parametric state-space models. We propose a new procedure for efficiently computing the evidence lower bound and its gradient in a streaming-data setting, where observations arrive sequentially. The algorithm allows for the simultaneous training of the model parameters and the distribution of the latent states given the observations. It is based on i.i.d. Monte Carlo sampling, coupled with a well-chosen deep architecture, enabling both computational efficiency and flexibility. The performance of the method is illustrated on both synthetic data and real-world air-quality data. The proposed approach is theoretically motivated by the existence of an asymptotic contrast function and the ergodicity of the underlying Markov chain, and applies more generally to the computation of additive expectations under posterior distributions in state-space models.
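To make the idea of a backward-factorized variational distribution and an i.i.d. Monte Carlo ELBO estimate concrete, here is a minimal NumPy sketch on a scalar linear-Gaussian state-space model. It is an offline illustration under stated assumptions, not the online recursion of the paper: the model, the function names (`log_normal`, `kalman_filter`, `backward_elbo_terms`) and all parameter values are hypothetical, and the variational family is simply taken to be Gaussian, with a terminal marginal $q_T$ and backward kernels $q_{t-1|t}$ built from a set of per-time means and variances.

```python
import numpy as np


def log_normal(x, mean, var):
    """Log-density of N(mean, var) at x (broadcasts over arrays)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def kalman_filter(y, a, q, r, p0):
    """Filtering means/variances and exact log-likelihood for the assumed model
    x_0 ~ N(0, p0), x_t = a x_{t-1} + N(0, q), y_t = x_t + N(0, r)."""
    means, variances, loglik = [], [], 0.0
    m_pred, p_pred = 0.0, p0
    for y_t in y:
        s = p_pred + r                          # predictive variance of y_t
        loglik += log_normal(y_t, m_pred, s)
        k = p_pred / s                          # Kalman gain
        m, p = m_pred + k * (y_t - m_pred), (1.0 - k) * p_pred
        means.append(m)
        variances.append(p)
        m_pred, p_pred = a * m, a * a * p + q
    return np.array(means), np.array(variances), loglik


def backward_elbo_terms(y, a, q, r, p0, means, variances, n_samples, rng):
    """Per-sample values of log p(x_{0:T}, y_{0:T}) - log q(x_{0:T}), with
    q(x_{0:T}) = q_T(x_T) * prod_t q_{t-1|t}(x_{t-1} | x_t) sampled backward."""
    T = len(y) - 1
    # Draw x_T from the terminal marginal q_T = N(means[T], variances[T]).
    x = means[T] + np.sqrt(variances[T]) * rng.standard_normal(n_samples)
    terms = log_normal(y[T], x, r) - log_normal(x, means[T], variances[T])
    for t in range(T, 0, -1):
        # Gaussian backward kernel q_{t-1|t}(. | x_t) implied by (means, variances).
        gain = a * variances[t - 1] / (a * a * variances[t - 1] + q)
        b_mean = means[t - 1] + gain * (x - a * means[t - 1])
        b_var = variances[t - 1] * (1.0 - gain * a)
        x_prev = b_mean + np.sqrt(b_var) * rng.standard_normal(n_samples)
        terms += log_normal(x, a * x_prev, q)          # transition density
        terms += log_normal(y[t - 1], x_prev, r)       # emission at time t-1
        terms -= log_normal(x_prev, b_mean, b_var)     # backward kernel density
        x = x_prev
    terms += log_normal(x, 0.0, p0)                    # prior on x_0
    return terms


# Simulate a short sequence from the assumed model (hypothetical parameters).
rng = np.random.default_rng(0)
a, q, r, p0, T = 0.9, 0.5, 1.0, 1.0, 50
x_true = np.empty(T + 1)
x_true[0] = np.sqrt(p0) * rng.standard_normal()
for t in range(1, T + 1):
    x_true[t] = a * x_true[t - 1] + np.sqrt(q) * rng.standard_normal()
y = x_true + np.sqrt(r) * rng.standard_normal(T + 1)

means, variances, loglik = kalman_filter(y, a, q, r, p0)
terms = backward_elbo_terms(y, a, q, r, p0, means, variances, 1000, rng)
print("Monte Carlo ELBO:", terms.mean(), "exact log-likelihood:", loglik)
```

When the variational means and variances are the exact Kalman filtering moments, as in the call above, the backward factorization reproduces the true smoothing distribution, so every sampled term equals the exact log-likelihood up to rounding. Any other choice (for example, inflating `variances`) yields an average that falls below the log-likelihood, which is the gap a variational learning procedure tries to close.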

Paper Structure

This paper contains 39 sections, 7 theorems, 128 equations, 6 figures, 3 tables, 3 algorithms.

Key Result

Proposition 4.1

For every $t \in \mathbb{N}$ and $(\theta, \phi) \in \Theta \times \Phi$, the ELBO and its gradient are given by […], where the real-valued function $h_{t}$ on $\mathsf{X}$ and its gradients $u_{t} \vcentcolon= \nabla_{\phi} h_{t}$ and $v_{t} \vcentcolon= \nabla_{\theta} h_{t}$ satisfy the recursions […], with $h_{0}(x_0) = \ell_{0}^{\, \theta, \phi}(x_{-1}, x_{0})$, $u_{0}(x_0) = 0$, and $v_{0}(x_0)=\nabla_

Figures (6)

  • Figure 1: (a) Evolution of $\widehat{\mathcal{L}}^{\theta, \phi}_{t}/t$ during online learning in the linear-Gaussian SSM. Vertical lines indicate the times at which state estimation is performed on a test sequence. (b) State estimation on a test sequence (only one dimension is displayed).
  • Figure 2: Model-parameter learning in the linear-Gaussian HMM. Mean absolute errors of the transition ($F$) and emission ($G$) matrices for our method and that of Campbell et al. (2021).
  • Figure 3: (a) Parameter MAE. (b) Approximate $\widetilde{\mathcal{L}}_t/t$ with checkpoint markers. (c) State estimation on a test sequence for one latent dimension. Colored lines/markers correspond to the same checkpoints.
  • Figure 4: Simultaneous Variational Learning and Prediction. (a) Smoothing reconstruction during sensor failure (black dotted). RMCVI (orange) recovers the Truth (gray) while LSTM (blue) overfits the corruption. (b) Cumulative RMSE for one-step-ahead prediction averaged across all pollution features.
  • Figure 5: Evolution of $\mathcal{L}_{T}^{\theta, \phi}/T$ computed with three different methods and with three different types of gradient estimates. Solid lines: means of the 10 replicates. Shaded lines: standard deviations of the 10 replicates.
  • ...and 1 more figure

Theorems & Definitions (17)

  • Proposition 4.1
  • proof
  • Theorem 4.5: Geometric ergodicity of $(Z_t)_{t \in \mathbb{N}}$
  • proof
  • Definition C.4
  • Theorem C.5
  • Lemma C.6
  • Lemma C.7
  • proof: Proof of \ref{lem:ump:explicit:form}
  • Lemma C.8
  • ...and 7 more