Latent Space Energy-based Neural ODEs
Sheng Cheng, Deqian Kong, Jianwen Xie, Kookjin Lee, Ying Nian Wu, Yezhou Yang
TL;DR
The paper introduces the Latent Space Energy-based Neural ODE (ODE-LEBM), a continuous-time sequence model with three components: an expressive energy-based prior over the initial latent state $z_{t_0}$, a neural ODE generator that evolves the latent state over time, and an emission model that maps latent states to observations. Rather than relying on variational inference or a separate inference network, it is trained by maximum likelihood, using MCMC to sample from both the posterior and the prior. The model can also disentangle trajectory-specific static ($z_s$) and dynamic ($z_d$) latent variables, which improves interpretability and generalization across dynamic regimes. Empirical results on irregular time series, Rotating MNIST, bouncing balls, and MuJoCo show better interpolation and extrapolation than strong baselines, robust out-of-distribution (OOD) detection using the learned energy, and more interpretable latent representations. Ablations highlight the necessity of the EBM prior and Langevin-based inference.
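As a rough guide to how the three components fit together, the generative process can be written as below. The parameter names $\alpha$, $\beta$, $\gamma$, the energy $E_\alpha$, and the vector field $f_\beta$ are notation assumed here for illustration and are not taken from the paper. The prior is shown in the generic energy-based form, and the paper's exact parameterization may differ.

$$
z_{t_0} \sim p_\alpha(z_{t_0}) \propto \exp\bigl(-E_\alpha(z_{t_0})\bigr), \qquad
\frac{dz_t}{dt} = f_\beta(z_t, t), \qquad
x_{t_i} \sim p_\gamma(x_{t_i} \mid z_{t_i}), \quad i = 0, \dots, N.
$$

Under this reading, all randomness in the latent trajectory enters through the initial state $z_{t_0}$. The ODE then determines $z_{t_i}$ at every observation time $t_i$, which need not be evenly spaced.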
Abstract
This paper introduces novel deep dynamical models designed to represent continuous-time sequences. Our approach employs a neural emission model to generate each data point in the time series through a non-linear transformation of a latent state vector. The evolution of these latent states is implicitly defined by a neural ordinary differential equation (ODE), with the initial state drawn from an informative prior distribution parameterized by an energy-based model (EBM). This framework is extended to disentangle dynamic states from underlying static factors of variation, represented as time-invariant variables in the latent space. We train the model end-to-end using maximum likelihood estimation with Markov chain Monte Carlo (MCMC). Experimental results on oscillating systems, videos, and real-world state sequences (MuJoCo) demonstrate that our model with the learnable energy-based prior outperforms existing counterparts and can generalize to new dynamic parameterizations, enabling long-horizon predictions.
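The generative path described above (sample an initial latent state from an energy-based prior, evolve it with an ODE, then emit observations) can be illustrated with a minimal NumPy sketch. Everything in it is a stand-in chosen for illustration and is not the authors' implementation: the quadratic energy, the rotational vector field, the linear emission, the fixed-step RK4 integrator, and all function names are assumptions. A real model would use neural networks for each component and would also run Langevin sampling on the posterior during training, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_energy(z, mu):
    # Toy energy U(z) = 0.5 * ||z - mu||^2 (assumed stand-in for a learned neural energy).
    return z - mu

def langevin_sample(mu, n_steps=200, step=0.1, dim=2):
    # Unadjusted Langevin dynamics targeting p(z) proportional to exp(-U(z)):
    # z <- z - (step^2 / 2) * grad U(z) + step * noise
    z = rng.standard_normal(dim)
    for _ in range(n_steps):
        noise = rng.standard_normal(dim)
        z = z - 0.5 * step**2 * grad_energy(z, mu) + step * noise
    return z

def ode_rhs(z):
    # Hypothetical latent dynamics dz/dt = A z (a planar rotation),
    # standing in for a neural vector field.
    A = np.array([[0.0, -1.0], [1.0, 0.0]])
    return A @ z

def rk4_step(z, dt):
    # One classical fourth-order Runge-Kutta step.
    k1 = ode_rhs(z)
    k2 = ode_rhs(z + 0.5 * dt * k1)
    k3 = ode_rhs(z + 0.5 * dt * k2)
    k4 = ode_rhs(z + dt * k3)
    return z + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(z0, times, max_dt=0.01):
    # Return the latent state at each (possibly irregular) observation time,
    # starting from z0 at times[0].
    z = z0
    states = [z0]
    for t_prev, t_next in zip(times[:-1], times[1:]):
        n_sub = max(1, int(np.ceil((t_next - t_prev) / max_dt)))
        dt = (t_next - t_prev) / n_sub
        for _ in range(n_sub):
            z = rk4_step(z, dt)
        states.append(z)
    return np.stack(states)

def emit(states, W):
    # Linear emission mean x_t = W z_t (assumed; the paper uses a neural emission model).
    return states @ W.T

def generate(mu, times, W):
    # Prior sample -> latent trajectory -> observation means.
    z0 = langevin_sample(mu)
    states = integrate(z0, times)
    return z0, states, emit(states, W)

times = np.array([0.0, 0.3, 0.35, 1.2])  # irregularly spaced observation times
W = np.ones((3, 2))                       # maps 2-D latents to 3-D observations
z0, states, x = generate(np.array([2.0, -1.0]), times, W)
```

Because the toy vector field is a pure rotation, the latent norm should stay essentially constant along the trajectory, which gives a quick sanity check on the integrator. Swapping `grad_energy` for the gradient of a learned energy and `ode_rhs` for a learned vector field leaves the overall control flow unchanged.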
