Probabilistic Recurrent State-Space Models

Andreas Doerr, Christian Daniel, Martin Schiegg, Duy Nguyen-Tuong, Stefan Schaal, Marc Toussaint, Sebastian Trimpe

TL;DR

The paper tackles learning probabilistic, nonlinear, partially observable dynamical systems by introducing PR-SSM, a Gaussian-process state-space framework that preserves temporal correlations in the latent state. It couples a variational sparse-GP prior over latent transitions with a doubly stochastic inference scheme and a recognition model for robust initialization, enabling scalable training on long sequences. PR-SSM yields probabilistic predictions with calibrated uncertainty and demonstrates strong performance and robustness on real-world benchmarks and a large-scale robotic dataset, outperforming a leading Markovian GP-SSM and competing with latent autoregressive methods. The work advances scalable, principled Bayesian system identification and offers practical benefits for control and reinforcement learning tasks requiring uncertainty-aware dynamics models.

Abstract

State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex time series data. Fully probabilistic SSMs, however, are often found hard to train, even for smaller problems. To overcome this limitation, we propose a novel model formulation and a scalable training algorithm based on doubly stochastic variational inference and Gaussian processes. In contrast to existing work, the proposed variational approximation allows one to fully capture the latent state temporal correlations. These correlations are the key to robust training. The effectiveness of the proposed PR-SSM is evaluated on a set of real-world benchmark datasets in comparison to state-of-the-art probabilistic model learning methods. Scalability and robustness are demonstrated on a high dimensional problem.
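
As a rough illustration of what a fully probabilistic SSM prediction looks like, the sketch below propagates samples of a latent state through a transition function and summarizes the output as mean and standard deviation per time step. It is only a sketch under stated assumptions: `transition`, `rollout`, and all noise levels are hypothetical, and the fixed `tanh` map stands in for the learned GP transition model, which is not reproduced here.

```python
import math
import random
import statistics


def transition(x, u):
    """Hypothetical stand-in for a learned transition (NOT the paper's GP)."""
    return 0.9 * math.tanh(x) + 0.1 * u


def rollout(u_seq, n_samples=500, process_std=0.05, obs_std=0.1, seed=0):
    """Sample-based free simulation of a scalar latent state-space model.

    Returns the per-step mean and standard deviation of the sampled outputs.
    """
    rng = random.Random(seed)
    # Uninformative initial state: x_1 ~ N(0, 1) for every sample.
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    means, stds = [], []
    for u in u_seq:
        # Observation model (assumed): latent state plus Gaussian noise.
        ys = [x + rng.gauss(0.0, obs_std) for x in xs]
        means.append(statistics.fmean(ys))
        stds.append(statistics.pstdev(ys))
        # Each sample keeps its own trajectory, so temporal correlations
        # along a trajectory are preserved from step to step.
        xs = [transition(x, u) + rng.gauss(0.0, process_std) for x in xs]
    return means, stds


if __name__ == "__main__":
    means, stds = rollout([0.0] * 50)
    for t in (0, 10, 49):
        lo, hi = means[t] - 2 * stds[t], means[t] + 2 * stds[t]
        print(f"t={t}: mean={means[t]:.3f}, band=[{lo:.3f}, {hi:.3f}]")
```

Because the assumed transition is contractive, the spread caused by the uninformative initial state shrinks over the first steps, which is the kind of initial transient a recognition model is meant to reduce.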

Paper Structure

This paper contains 28 sections, 21 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Graphical model of the PR-SSM. Gray nodes are observed variables, in contrast to latent variables in white nodes. Thick lines indicate variables that are jointly Gaussian under a GP prior.
  • Figure 2: Predictions of the initial, untrained (left) and the final, trained PR-SSM (right) based on the full gradient ELBO optimization. The system input/output data (blue) is visualized together with the model prediction (orange) for a part of the Furnace dataset. Samples of the latent space distribution and output distribution are shown in gray. The shaded areas visualize mean +/- two std. The initial model exhibits a random walk behavior in the latent space. In the trained model, the decay of the initial state uncertainty can be observed in the first time steps. In this experiment, no recognition model has been used (cf. Sec. \ref{sec:ExtensionsforLargeDatasets}).
  • Figure 3: Comparison of the fully trained PR-SSM predictions with (lower row) and without (upper row) initial state recognition model on a test dataset. The initial transient based on the uncertainty from an uninformative initial state distribution $q(\bm{x}_1) = \mathcal{N}(\bm{x}_1 \mid \bm{0}, \bm{I})$ decays, as shown in the upper plots. The lower plots show the outcome when $q(\bm{x}_1)$ is initialized by the smoothing distribution $q(\bm{x}_1 \mid \bm{y}_{1:L}, \bm{u}_{1:L})$, given the first $L$ steps of system input/output. Using the recognition model yields a significantly improved latent state initialization and therefore decreases the initial state uncertainty and the initial transient behavior.
  • Figure 4: Free simulation results for the benchmark methods on the Drives test dataset. The true, observed system output (blue) is compared to the individual model's predictive output distribution (orange, mean +/- two std). Results are presented for the one-step-ahead models GP-NARX and NIGP in the left column. REVARB and MSGP (shown in the middle column) are both based on multi-step optimized autoregressive GP models in latent space. In the right column, the SS-GP-SSM, a model based on a Markovian latent state, is compared to the proposed PR-SSM.
  • Figure 5: Results on the Sarcos large scale task: Predictions from the GP-NARX baseline (red) and the PR-SSM (orange) for two out of seven joint positions. The ground truth, measured joint positions are shown in blue. PR-SSM clearly improves over the GP-NARX predictions. Similar results are obtained for PR-SSM on the remaining 5 joints, where the GP-NARX model fails completely (cf. supplementary materials for details).
  • ...and 2 more figures