Latent Space Energy-based Neural ODEs

Sheng Cheng; Deqian Kong; Jianwen Xie; Kookjin Lee; Ying Nian Wu; Yezhou Yang

TL;DR

The paper introduces the Latent Space Energy-based Neural ODE (ODE-LEBM), a continuous-time sequence model that combines a neural ODE generator, an emission model, and an expressive energy-based prior over the initial latent state $z_{t_0}$. It is trained by maximum likelihood, using MCMC-based sampling from both the posterior and the prior, which avoids variational inference and inference networks. The model can disentangle trajectory-specific static ($z_s$) and dynamic ($z_d$) latent variables to improve interpretability and generalization across dynamic regimes. Empirical results on irregular time series, Rotating MNIST, bouncing balls, and MuJoCo show superior interpolation and extrapolation, robust out-of-distribution (OOD) detection via the energy, and more interpretable latent representations than strong baselines; ablations highlight the necessity of the EBM prior and Langevin-based inference.
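
To make the Langevin-based sampling mentioned above concrete, here is a minimal sketch of Langevin dynamics drawing samples from an energy-based distribution $p(z) \propto \exp(-E(z))$. It is an illustration only, not the paper's implementation: the one-dimensional quadratic energy, the step size, the number of steps, and the names `grad_energy` and `langevin_sample` are all assumptions made for this example. In the actual model the energy would be a learned network over $z_{t_0}$.

```python
import random


def grad_energy(z, mu=2.0, sigma=0.5):
    """Gradient of a toy energy E(z) = (z - mu)^2 / (2 * sigma^2).

    Assumption: this quadratic energy stands in for a learned EBM; its
    target density is a Gaussian with mean `mu` and std `sigma`.
    """
    return (z - mu) / sigma**2


def langevin_sample(z_init, rng, n_steps=200, step_size=0.1):
    """Run unadjusted Langevin dynamics starting from `z_init`.

    Each step moves downhill on the energy and adds Gaussian noise:
        z <- z - (s^2 / 2) * dE/dz + s * eps,   eps ~ N(0, 1)
    """
    z = z_init
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        z = z - 0.5 * step_size**2 * grad_energy(z) + step_size * noise
    return z


# Start 500 chains from a standard normal and run each to (near) convergence.
rng = random.Random(0)
samples = [langevin_sample(rng.gauss(0.0, 1.0), rng) for _ in range(500)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
```

With this toy energy the chains should settle around `mu = 2.0` with variance close to `sigma**2 = 0.25`. A finite step size and a finite number of steps leave a small bias, which is why short-run Langevin chains are usually treated as approximate samplers.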

Abstract

This paper introduces novel deep dynamical models designed to represent continuous-time sequences. Our approach employs a neural emission model to generate each data point in the time series through a non-linear transformation of a latent state vector. The evolution of these latent states is implicitly defined by a neural ordinary differential equation (ODE), with the initial state drawn from an informative prior distribution parameterized by an Energy-based model (EBM). This framework is extended to disentangle dynamic states from underlying static factors of variation, represented as time-invariant variables in the latent space. We train the model using maximum likelihood estimation with Markov chain Monte Carlo (MCMC) in an end-to-end manner. Experimental results on oscillating systems, videos and real-world state sequences (MuJoCo) demonstrate that our model with the learnable energy-based prior outperforms existing counterparts, and can generalize to new dynamic parameterization, enabling long-horizon predictions.
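
The generative process described in the abstract (an initial latent state, latent evolution under an ODE, and an emission map to data space) can be sketched as follows. This is a hedged toy illustration under stated assumptions: `ode_func` is a hand-written 2-D rotation rather than a neural network, `emission` is a fixed non-linear map with no observation noise, the solver is a fixed-step RK4 written for this example, and `z0` is set by hand rather than drawn from an EBM prior. None of these names or choices come from the paper.

```python
import math


def ode_func(z):
    """Stand-in for the neural ODE vector field dz/dt = f(z): a 2-D rotation."""
    return [-z[1], z[0]]


def emission(z):
    """Stand-in for the neural emission model: latent (2-D) -> data (3-D)."""
    return [math.tanh(z[0] + 0.5 * z[1]),
            math.tanh(z[0] - z[1]),
            math.tanh(2.0 * z[1])]


def rk4_step(z, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = ode_func(z)
    k2 = ode_func([zi + 0.5 * dt * ki for zi, ki in zip(z, k1)])
    k3 = ode_func([zi + 0.5 * dt * ki for zi, ki in zip(z, k2)])
    k4 = ode_func([zi + dt * ki for zi, ki in zip(z, k3)])
    return [zi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]


def integrate(z0, times, max_dt=0.01):
    """Return the latent state at every time in `times` (times[0] is t_0).

    The observation times may be irregularly spaced; each gap is split
    into equal sub-steps no longer than `max_dt`.
    """
    states = [list(z0)]
    z, t = list(z0), times[0]
    for t_next in times[1:]:
        n_sub = max(1, math.ceil((t_next - t) / max_dt))
        dt = (t_next - t) / n_sub
        for _ in range(n_sub):
            z = rk4_step(z, dt)
        t = t_next
        states.append(list(z))
    return states


def generate(z0, times):
    """Map every latent state on the trajectory through the emission model."""
    return [emission(z) for z in integrate(z0, times)]


# Irregularly spaced observation times, ending after one full rotation.
times = [0.0, 0.3, 0.35, 1.2, 2.0 * math.pi]
z0 = [1.0, 0.0]  # assumption: fixed by hand; the model samples this from the prior
latents = integrate(z0, times)
xs = generate(z0, times)
```

Because the latent trajectory is defined in continuous time, the same code handles any set of query times, which is what makes interpolation and extrapolation on irregularly sampled sequences natural in this model family.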

Paper Structure (36 sections, 15 equations, 9 figures, 11 tables, 2 algorithms)

Introduction
Related Work
Neural ODEs
Latent space energy-based model
Latent Space Energy-based Neural ODE
Experiments
Baseline methods
Implementation details
Overview
Irregularly-sampled time series
Rotating MNIST
Bouncing balls with friction
MuJoCo physics simulation
Irregularly-Sampled Time Series
Disentangle Trajectory-specific Latent Variables
...and 21 more sections

Figures (9)

Figure 1: An illustration of ODE-LEBM. The initial latent state $z_{t_0}$ follows a learnable EBM prior distribution $p_\alpha(z_{t_0})$ (\ref{eq:prior}). Subsequent latent states $(z_{t_1},\dots, z_{t_T})$ are generated using a neural ODE (\ref{eq:ode}). All latent states are then mapped to the data space through an emission model (\ref{eq:emission}).
Figure 2: Interpolation results on irregularly-sampled time series. The latent initial state is sampled based on partial observations, $z_{t_0}\sim p_\theta(z_{t_0}|{\mathbf{x}})$, and then used to predict the entire sequence. For comparison, we also show the results of directly sampling from the learned latent EBM prior, $z_{t_0}\sim p_\alpha(z_{t_0})$ in the last column.
Figure 3: Interpolation and extrapolation results on the rotating-MNIST test set. The first $15$ steps represent interpolation, while the last $30$ steps represent extrapolation. The first row shows the model predictions for each digit, and the second row presents the ground-truth observations.
Figure 4: PCA embeddings of both $z_{t_0}$ and $z_s$ for the Rotating MNIST dataset, as inferred by posterior sampling. We generate 16 trajectories from a single digit, incrementing the initial angle of each trajectory by $24^\circ$, starting from $0^\circ$ until $360^\circ$. In (a) and (c), circles denote the start of the trajectory (the initial angle), and lines represent the ODE trajectory. The color gradient corresponds to the initial angle of the trajectory in the observation space. (a) and (b) are from the same sequences with time reset, while (c) and (d) are from the same sequences without time reset.
Figure 5: t-SNE plot of static variable $z_s$ with randomly sampled sequences.
...and 4 more figures
