
Enhancing Continuous Time Series Modelling with a Latent ODE-LSTM Approach

C. Coelho, M. Fernanda P. Costa, L. L. Ferrás

TL;DR

This work tackles the difficulty of modelling continuous-time series with irregular sampling by introducing a Latent ODE-LSTM encoder within a variational framework, paired with a Neural ODE decoder to produce a continuous latent trajectory. The proposed ODE-LSTM encoder alleviates vanishing gradients, while gradient clipping is applied to curb potential explosions, yielding more stable training than Latent ODE-RNN baselines. Extensive experiments on synthetic spirals and real-world data (daily climate and DJIA stocks) show improved reconstruction and competitive extrapolation, with the gradient clipping variant offering enhanced forward/backward predictive behavior. The framework advances CTS modelling by combining continuous-time latent dynamics with robust gradient flow, enabling accurate inference and extrapolation under irregular sampling conditions.

Abstract

Continuous Time Series (CTS), which exhibit dynamic properties such as irregular sampling rates and high-frequency sampling, are found in many applications. Since CTS with irregular sampling rates are difficult to model with standard Recurrent Neural Networks (RNNs), RNNs have been generalised to have continuous-time hidden dynamics defined by a Neural Ordinary Differential Equation (Neural ODE), leading to the ODE-RNN model. Another approach that provides better modelling is the Latent ODE model, which constructs a continuous-time model where a latent state is defined at all times. The Latent ODE model uses a standard RNN as the encoder and a Neural ODE as the decoder. However, since the RNN encoder leads to difficulties with missing data and ill-defined latent variables, a Latent ODE-RNN model has recently been proposed that uses an ODE-RNN model as the encoder instead. Both the Latent ODE and Latent ODE-RNN models are difficult to train due to the vanishing and exploding gradients problem. To overcome this problem, the main contribution of this paper is to propose and illustrate a new model, the Latent ODE-LSTM, a Latent ODE that uses an ODE-LSTM (Long Short-Term Memory) network as the encoder. To limit the growth of the gradients, the Norm Gradient Clipping strategy was embedded in the Latent ODE-LSTM model. The performance of the new Latent ODE-LSTM (with and without Norm Gradient Clipping) for modelling CTS with regular and irregular sampling rates is then evaluated. Numerical experiments show that the new Latent ODE-LSTM performs better than Latent ODE-RNNs and can avoid vanishing and exploding gradients during training.
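
The abstract names Norm Gradient Clipping as the strategy used to limit gradient growth. As a rough illustration of the general technique (not the authors' implementation), the sketch below rescales a set of gradients whenever their joint L2 norm exceeds a threshold, leaving the direction unchanged. The function name `clip_by_global_norm`, the `max_norm` parameter, and the toy gradient values are assumptions made for this example only.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their joint L2 norm is at most max_norm.

    grads is a list of gradient vectors (lists of floats), one per parameter
    group. Returns the (possibly rescaled) gradients and the norm before
    clipping. Hypothetical helper for illustration only.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm <= max_norm:
        # Small enough already: return an unchanged copy.
        return [list(vec) for vec in grads], total_norm
    # Shrink every component by the same factor so the direction is preserved.
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads], total_norm


# Toy gradients whose joint norm is sqrt(9 + 16 + 144) = 13.
grads = [[3.0, 4.0], [12.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
print(norm_before, norm_after)
```

Because every component is multiplied by the same factor, clipping only shortens the update step; it does not change which direction the optimiser moves in.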

Paper Structure

This paper contains 31 sections, 69 equations, 12 figures, 3 tables, 2 algorithms.

Figures (12)

  • Figure 1: Schematic representation of an RNN applied to sequential data of arbitrary length. The feedback loop can be unrolled over time and represented as $N$ copies of the RNN cell, resembling a deep fully-connected feed-forward NN with a depth of $N$.
  • Figure 2: Long Short-Term Memory cell scheme.
  • Figure 3: Architecture of a Neural Ordinary Differential Equation. In the training phase, the time interval given to the ODE solver is $[t_0, t_N]$ and in the testing phase, predictions can be made in an arbitrary time interval $[t_0, t_f]$.
  • Figure 4: Architecture of an Autoencoder consisting of an encoder and a decoder.
  • Figure 5: Architecture of a VAE consisting of two main components: a probabilistic encoder and a probabilistic decoder.
  • ...and 7 more figures