Latent SDEs on Homogeneous Spaces

Sebastian Zeng, Florian Graf, Roland Kwitt

TL;DR

The paper proposes learning latent stochastic dynamics restricted to homogeneous spaces, realized by Lie group actions, with a primary focus on SDEs on the unit sphere. By employing a one-step geometric Euler–Maruyama solver and discretize-then-optimize gradients, the approach yields efficient variational inference with a notably simple KL divergence on the sphere. Empirically, the method achieves competitive or state-of-the-art results on interpolation and per-time-point classification/regression across multiple datasets, while maintaining favorable runtime characteristics compared to more flexible neural SDEs. The work highlights a principled, geometry-friendly alternative to fully general neural SDEs, balancing model capacity and tractable training, and opens opportunities to extend to other homogeneous spaces and more advanced numerical schemes.

Abstract

We consider the problem of variational Bayesian inference in a latent variable model where a (possibly complex) observed stochastic process is governed by the solution of a latent stochastic differential equation (SDE). Motivated by the challenges that arise when trying to learn an (almost arbitrary) latent neural SDE from data, such as efficient gradient computation, we take a step back and study a specific subclass instead. In our case, the SDE evolves on a homogeneous latent space and is induced by stochastic dynamics of the corresponding (matrix) Lie group. In learning problems, SDEs on the unit n-sphere are arguably the most relevant incarnation of this setup. Notably, for variational inference, the sphere not only facilitates using a truly uninformative prior, but we also obtain a particularly simple and intuitive expression for the Kullback-Leibler divergence between the approximate posterior and prior process in the evidence lower bound. Experiments demonstrate that a latent SDE of the proposed type can be learned efficiently by means of an existing one-step geometric Euler-Maruyama scheme. Despite restricting ourselves to a less rich class of SDEs, we achieve competitive or even state-of-the-art results on various time series interpolation/classification problems.
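
To make the "one-step geometric Euler-Maruyama scheme" mentioned above more concrete, the following is a minimal sketch on the unit 2-sphere, not the authors' implementation. It assumes the common form of such a step: build a Lie algebra element from a drift term times the step size plus scaled Gaussian increments, then move the current point by the corresponding rotation. The names `rotate`, `geometric_em_path`, `drift`, and `sigma` are hypothetical, and the constant drift and isotropic noise scale are illustrative choices only.

```python
import math
import random


def rotate(x, omega):
    """Apply exp(hat(omega)) to x in R^3 via Rodrigues' rotation formula.

    hat(omega) is the skew-symmetric matrix of omega, so this rotates x
    about the axis omega / |omega| by the angle |omega|.
    """
    theta = math.sqrt(sum(w * w for w in omega))
    if theta < 1e-12:  # negligible rotation: return a copy of x
        return list(x)
    k = [w / theta for w in omega]
    # Cross product k x x and dot product k . x
    kx = [
        k[1] * x[2] - k[2] * x[1],
        k[2] * x[0] - k[0] * x[2],
        k[0] * x[1] - k[1] * x[0],
    ]
    kdx = sum(a * b for a, b in zip(k, x))
    c, s = math.cos(theta), math.sin(theta)
    return [x[i] * c + kx[i] * s + k[i] * kdx * (1.0 - c) for i in range(3)]


def geometric_em_path(x0, drift, sigma, ts, rng):
    """Simulate a path on S^2 with one group-exponential step per time interval.

    drift(t) returns a 3-vector of so(3) coordinates (hypothetical interface);
    sigma is an assumed isotropic noise scale.
    """
    path = [list(x0)]
    for t0, t1 in zip(ts[:-1], ts[1:]):
        dt = t1 - t0
        k = drift(t0)
        # Lie algebra increment: drift * dt + sigma * Wiener increment
        omega = [k[i] * dt + sigma * rng.gauss(0.0, math.sqrt(dt)) for i in range(3)]
        path.append(rotate(path[-1], omega))
    return path


rng = random.Random(0)
ts = [0.01 * i for i in range(101)]
path = geometric_em_path([0.0, 0.0, 1.0], lambda t: [0.0, 0.5, 0.0], 0.3, ts, rng)

# Every iterate is a rotation of the previous one, so it stays on the sphere.
max_norm_error = max(abs(math.sqrt(sum(c * c for c in x)) - 1.0) for x in path)
```

Because each update is a rotation, the iterates remain on the sphere up to floating-point error without any projection step, which is the practical appeal of a geometric scheme over a plain Euler-Maruyama update in the ambient space.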

Paper Structure

This paper contains 58 sections, 5 theorems, 42 equations, 11 figures, 10 tables, and 1 algorithm.

Key Result

Lemma 3

Let $G$ be a stochastic process that solves the matrix Itô SDE where $\mathbf{V}_0: [0,T]\to \mathbb{R}^{n \times n}$, $\mathbf{V}_1,\ldots,\mathbf{V}_m \in \mathbb{R}^{n \times n}$ and $w^1,\ldots,w^m$ are independent scalar-valued Wiener processes. The process $G$ stays in the quadratic matrix Lie group $\mathcal{G}(\mathbf{P})$ defined by $\mathbf{P}$ ...
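
The displayed SDE did not survive in the statement above. As a hedged reconstruction, a linear matrix Itô SDE consistent with the symbols the lemma names (a time-dependent drift $\mathbf{V}_0$, constant matrices $\mathbf{V}_1,\ldots,\mathbf{V}_m$, and independent Wiener processes $w^1,\ldots,w^m$) would take the form below. The exact arrangement of terms and the identity initial condition are assumptions, not quoted from the paper. The second line recalls the standard definition of a quadratic matrix Lie group.

$$
dG_t = \Big(\mathbf{V}_0(t)\,dt + \sum_{i=1}^{m} \mathbf{V}_i\,dw_t^i\Big)\,G_t, \qquad G_0 = \mathbf{I}_n
$$

$$
\mathcal{G}(\mathbf{P}) = \{\mathbf{A} \in \mathrm{GL}(n) : \mathbf{A}^\top \mathbf{P}\,\mathbf{A} = \mathbf{P}\}
$$

For $\mathbf{P} = \mathbf{I}_n$ this group is the orthogonal group $\mathrm{O}(n)$, whose elements map unit vectors to unit vectors. This is why group-valued dynamics of this kind induce dynamics on the unit sphere.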

Figures (11)

  • Figure 1: Sample paths from a prior (left) and posterior (right) process on the sphere. Initial values are marked.
  • Figure 2: Recognition network types.
  • Figure 3: Latent space paths with and without label switches (on $\mathbb{S}^2$) and distribution of KL divergences.
  • Figure 4: Exemplary reconstructions on the Rotating MNIST data. Shown are the results (on one testing sequence) by integrating forward from $t=0$ (marked red) to $t=4$, i.e., three times longer than what is observed ($t \in [0,1)$) during training.
  • Figure 5: Illustration of one exemplary training time series from the (left) regression and (right) pendulum position interpolation task, respectively, from Schirmer22a. Note that the images shown here are inverted for better visualization.
  • ...and 6 more figures

Theorems & Definitions (14)

  • Definition 1: cf. Lee03a
  • Definition 2: cf. Bloch08
  • Lemma 3: cf. Brockett73a, Chirikjian12a
  • Remark
  • proof
  • Remark
  • Remark
  • Lemma 4
  • proof
  • Lemma 5
  • ...and 4 more