Learning Chaotic Systems and Long-Term Predictions with Neural Jump ODEs

Florian Krach, Josef Teichmann

TL;DR

This work extends the Path-dependent Neural Jump ODE (PD-NJ-ODE) to provably enable long-term predictions for both chaotic deterministic systems and general stochastic processes that are observed irregularly or incompletely. It introduces two independent enhancements: (i) a training strategy based on input skipping, which forces the model to forecast further into the future, and (ii) the use of output feedback to stabilize training. Both preserve convergence to the $L^2$-optimal predictor and apply to a broad class of systems. Theoretical results show convergence to $\mathbb{E}[X_t|\mathcal{A}_t]$ in deterministic settings and to $\mathbb{E}[X_t|\mathcal{A}_{s\wedge t}]$ for general stochastic cases under the new training protocol. Empirically, the enhanced PD-NJ-ODE closely matches the true dynamics of a chaotic double pendulum and gives substantially better long-horizon predictions on geometric Brownian motion datasets, which illustrates its practical value for learning dynamics from irregular data and for long-term forecasting in complex systems.
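The input-skipping idea in (i) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' implementation: the helper `split_inputs_and_targets` and the parameter `skip_prob` are hypothetical names, and the rule used here (each observation after the first is withheld from the model input independently with probability `skip_prob`, while every observation remains a loss target) is only one plausible reading of the summary above.

```python
import random


def split_inputs_and_targets(obs_times, obs_values, skip_prob, rng):
    """Hypothetical helper: decide which observations the model may see.

    Assumption: the initial observation is always given as input; every later
    observation is withheld from the input with probability `skip_prob`.
    All observations are kept as training targets, so the model is still
    penalized at times where it received no fresh input.
    """
    inputs = []
    for i, (t, x) in enumerate(zip(obs_times, obs_values)):
        if i == 0 or rng.random() >= skip_prob:
            inputs.append((t, x))
    targets = list(zip(obs_times, obs_values))
    return inputs, targets


rng = random.Random(0)
times = [0.0, 0.1, 0.25, 0.4, 0.7, 1.0]      # irregular observation times
values = [1.0, 1.1, 0.9, 1.3, 1.2, 1.5]      # illustrative observed values
inputs, targets = split_inputs_and_targets(times, values, skip_prob=0.5, rng=rng)

# With skip_prob=1.0 only the initial value is used as input, so the model
# has to predict the entire remaining path from X_0 alone.
only_initial, _ = split_inputs_and_targets(times, values, skip_prob=1.0, rng=rng)
```

Under this reading, a larger `skip_prob` lengthens the typical gap between the last input and the times at which the loss is evaluated, which is what pushes the model toward longer forecasting horizons.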

Abstract

The Path-dependent Neural Jump ODE (PD-NJ-ODE) is a model for online prediction of generic (possibly non-Markovian) stochastic processes with irregular (in time) and potentially incomplete (with respect to coordinates) observations. It is a model for which convergence to the $L^2$-optimal predictor, which is given by the conditional expectation, is established theoretically. Thereby, the training of the model is solely based on a dataset of realizations of the underlying stochastic process, without the need of knowledge of the law of the process. In the case where the underlying process is deterministic, the conditional expectation coincides with the process itself. Therefore, this framework can equivalently be used to learn the dynamics of ODE or PDE systems solely from realizations of the dynamical system with different initial conditions. We showcase the potential of our method by applying it to the chaotic system of a double pendulum. When training the standard PD-NJ-ODE method, we see that the prediction starts to diverge from the true path after about half of the evaluation time. In this work we enhance the model with two novel ideas, which independently of each other improve the performance of our modelling setup. The resulting dynamics match the true dynamics of the chaotic system very closely. The same enhancements can be used to provably enable the PD-NJ-ODE to learn long-term predictions for general stochastic datasets, where the standard model fails. This is verified in several experiments.
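To make the notion of the $L^2$-optimal predictor concrete, the following sketch uses a geometric Brownian motion, for which the conditional expectation has the closed form $\mathbb{E}[X_t|X_s] = X_s e^{\mu (t-s)}$. The parameter values (`mu`, `sigma`, `dt`) and helper names are chosen purely for illustration and are not taken from the paper. The sketch compares the mean squared error of the conditional expectation with that of a naive "last observed value" predictor by Monte Carlo.

```python
import math
import random


def simulate_gbm_step(x_s, mu, sigma, dt, rng):
    """Sample X_{s+dt} given X_s for a geometric Brownian motion."""
    z = rng.gauss(0.0, 1.0)
    return x_s * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)


def mean_squared_error(prediction, samples):
    """Empirical L^2 error of a constant prediction against the samples."""
    return sum((x - prediction) ** 2 for x in samples) / len(samples)


rng = random.Random(42)
x_s, mu, sigma, dt = 1.0, 2.0, 0.3, 1.0   # illustrative values (assumptions)

samples = [simulate_gbm_step(x_s, mu, sigma, dt, rng) for _ in range(20000)]

cond_exp = x_s * math.exp(mu * dt)         # E[X_{s+dt} | X_s]
mse_cond_exp = mean_squared_error(cond_exp, samples)
mse_last_value = mean_squared_error(x_s, samples)   # naive predictor

print(f"MSE of conditional expectation: {mse_cond_exp:.3f}")
print(f"MSE of last-value predictor:    {mse_last_value:.3f}")
```

The conditional expectation attains the smaller error because it minimizes the expected squared error among all predictors that depend only on the available information. In the deterministic limit (`sigma = 0`) its error vanishes, which matches the statement that the conditional expectation then coincides with the process itself.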

Paper Structure (21 sections, 3 theorems, 10 equations, 3 figures, 2 tables)


Key Result

Corollary 2.1

Under the same assumptions as in andersson2024extending with the additional assumption that $X$ is deterministic given its initial value $X_0$, we denote by $\tilde{Y}^{\theta^{\min}_{m,N_m}}$ the output of the PD-NJ-ODE model, where only the fully observed initial value $X_0$ is used as input to th

Figures (3)

  • Figure 1: Left: test samples of a Double Pendulum with standard training framework (N). Right: the same test samples of the Double Pendulum with the enhanced training framework and larger dataset (N-OF-IIS-large). The conditional expectation coincides with the process, since it is deterministic.
  • Figure 2: Comparison of the standard (N; left) and enhanced (N-OF-IIS; right) model on a test sample of the BS-Base (top), BS-HighFrequ (middle) and BS-TimeDep (bottom) dataset.
  • Figure 3: A schematic representation of a double pendulum. Picture copied from DoublePendulum.

Theorems & Definitions (8)

  • Corollary 2.1
  • Remark 2.2
  • proof: Proof of Corollary 2.1.
  • Corollary 2.3
  • proof
  • Proposition 2.4
  • proof
  • Remark 2.5