Homotopy-based training of NeuralODEs for accurate dynamics discovery

Joon-Hyuk Ko, Hankyul Koh, Nojun Park, Wonho Jhe

TL;DR

This work addresses the difficulty of training NeuralODEs on long time-series data by introducing a synchronization-based coupling plus homotopy optimization that smooths the otherwise irregular loss landscape without constraining the model architecture. By coupling the model dynamics to the data through a tunable term and gradually reducing the homotopy parameter, the method guides optimization toward better minima and improves extrapolation. Empirical results on Lotka–Volterra, double pendulum, and Lorenz systems show faster convergence, robustness to noise, and competitive or superior predictive performance compared with vanilla training and multiple shooting. The approach offers a general, architecture-agnostic framework for dynamics discovery with potential applicability to high-dimensional and latent-ODE settings.

Abstract

Neural Ordinary Differential Equations (NeuralODEs) present an attractive way to extract dynamical laws from time series data, as they bridge neural networks with the differential equation-based modeling paradigm of the physical sciences. However, these models often display long training times and suboptimal results, especially for longer-duration data. While a common strategy in the literature imposes strong constraints on the NeuralODE architecture to inherently promote stable model dynamics, such methods are ill-suited for dynamics discovery as the unknown governing equation is not guaranteed to satisfy the assumed constraints. In this paper, we develop a new training method for NeuralODEs, based on synchronization and homotopy optimization, that does not require changes to the model architecture. We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape, which homotopy optimization can then leverage to enhance training. Through benchmark experiments, we demonstrate that our method achieves competitive or better training loss while often requiring less than half the number of training epochs compared to other model-agnostic techniques. Furthermore, models trained with our method display better extrapolation capabilities, highlighting the effectiveness of our method.
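
The idea described above, coupling the model to the data and then relaxing the coupling step by step, can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the system is a harmonic oscillator $x'' = -\theta x$ with a single scalar parameter `theta` standing in for the network weights, the coupling is taken to have the form $-k(\mathbf{u} - \hat{\mathbf{u}})$ with a scalar strength `k`, the schedule `(2.0, 1.0, 0.5, 0.0)` is arbitrary, and a finite-difference gradient descent with step halving replaces the actual NeuralODE training loop.

```python
import math

def make_data(theta_true=1.0, t_end=10.0, n_steps=100):
    """Exact samples (x, v) of x'' = -theta_true * x with x(0) = 1, v(0) = 0."""
    omega = math.sqrt(theta_true)
    h = t_end / n_steps
    data = [(math.cos(omega * i * h), -omega * math.sin(omega * i * h))
            for i in range(n_steps + 1)]
    return h, data

def rhs(state, target, theta, k):
    """Toy model vector field plus the assumed coupling term -k * (state - data)."""
    x, v = state
    xd, vd = target
    return (v - k * (x - xd), -theta * x - k * (v - vd))

def shifted(state, slope, a):
    """Return state + a * slope for 2-tuples."""
    return (state[0] + a * slope[0], state[1] + a * slope[1])

def coupled_loss(theta, k, h, data):
    """RK4-integrate the coupled model and return the MSE against the data."""
    state, total = data[0], 0.0
    for d0, d1 in zip(data, data[1:]):
        # Data value at the half step, approximated by linear interpolation.
        dm = ((d0[0] + d1[0]) / 2, (d0[1] + d1[1]) / 2)
        s1 = rhs(state, d0, theta, k)
        s2 = rhs(shifted(state, s1, h / 2), dm, theta, k)
        s3 = rhs(shifted(state, s2, h / 2), dm, theta, k)
        s4 = rhs(shifted(state, s3, h), d1, theta, k)
        state = (state[0] + h / 6 * (s1[0] + 2 * s2[0] + 2 * s3[0] + s4[0]),
                 state[1] + h / 6 * (s1[1] + 2 * s2[1] + 2 * s3[1] + s4[1]))
        dx, dv = state[0] - d1[0], state[1] - d1[1]
        total += dx * dx + dv * dv
    return total / (2 * (len(data) - 1))

def minimize(theta, k, h, data, n_iter=25, eps=1e-5, step0=8.0):
    """Finite-difference gradient descent with step halving (a stand-in optimizer)."""
    loss = coupled_loss(theta, k, h, data)
    for _ in range(n_iter):
        grad = (coupled_loss(theta + eps, k, h, data)
                - coupled_loss(theta - eps, k, h, data)) / (2 * eps)
        step = step0
        while step > 1e-6:
            trial = theta - step * grad
            trial_loss = coupled_loss(trial, k, h, data)
            if trial_loss < loss:
                break
            step *= 0.5
        else:
            return theta, loss  # no improving step found: treat as converged
        theta, loss = trial, trial_loss
    return theta, loss

def homotopy_train(theta0, h, data, schedule=(2.0, 1.0, 0.5, 0.0)):
    """Solve a sequence of coupled problems, warm-starting each from the last."""
    theta = theta0
    for k in schedule:  # the final k = 0 stage is the original, uncoupled problem
        theta, loss = minimize(theta, k, h, data)
    return theta, loss

h, data = make_data()
theta0 = 2.56  # deliberately poor initial guess (the true value is 1.0)
theta_h, loss_h = homotopy_train(theta0, h, data)
print("homotopy estimate:", theta_h, "final uncoupled loss:", loss_h)
```

Calling `minimize(theta0, 0.0, h, data)` directly gives the uncoupled baseline from the same starting point, which can be compared against `homotopy_train` for different data lengths `t_end`.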

Paper Structure

This paper contains 66 sections, 1 theorem, 18 equations, 21 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Assuming $\mathbf{U} = \hat{\mathbf{U}}$, $\boldsymbol{\theta} \approx \boldsymbol{\hat{\theta}}$, and $\mathbf{u}(t)$ in the vicinity of $\hat{\mathbf{u}}(t)$, the elements of $\mathbf{K}$ can be chosen so that the two states synchronize: that is, for error dynamics $\boldsymbol\xi(t) = \mathbf{u}(

Figures (21)

  • Figure 1: Optimization trajectories for our homotopy method (left) and vanilla gradient descent (right). While conventional training meanders on the pathological loss landscape, our method provides a series of relaxed landscapes that effectively guide the optimizer to the loss minimum.
  • Figure 2: Irregular loss landscape in NeuralODE training. (Upper left) Loss landscape and (Lower left) eigenvalue spectrum of the loss Hessian for increasing lengths of train data. (Upper right) Loss landscape and (Lower right) eigenvalue spectrum of the loss Hessian with increasing coupling strength. For clarity, loss values were clipped above at 4000. All Hessian-related information was calculated using the PyHessian package (Yao et al., 2020).
  • Figure 3: Results for the Lotka-Volterra system. (First) Interpolation and extrapolation errors for increasing data length. (Second) Errors for the mid-length data with decreasing model capacity. (Third) Errors for fixed time interval but with increasing data sparsity. (Fourth) Errors for mid-length data with increasing noise.
  • Figure 4: Performance benchmarks. (Left) Interpolation and (Center) extrapolation MSEs for each training method. (Right) Epochs where the minimum interpolation MSE was achieved. For all plots, the bar and the error bars denote the mean and standard errors across three runs.
  • Figure 5: Second-order NeuralODE training results for the double pendulum dataset. (Left) Predicted trajectories. Lines and bands correspond to means and standard errors over three training runs, and the dashed line indicates the extrapolation start time. (Right) Estimation of the Hessian trace during training. The homotopy curve stops early due to the difference in the number of maximum training curves used.
  • ...and 16 more figures

Theorems & Definitions (3)

  • Theorem 1: Synchronization
  • proof
  • Remark
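
The intuition behind the synchronization statement in Theorem 1 can be sketched with a standard linearization argument. The following is an illustration under stated assumptions, not the paper's proof: the coupled model is taken to have the form $\dot{\mathbf{u}} = \mathbf{U}(t, \mathbf{u}; \boldsymbol\theta) - \mathbf{K}(\mathbf{u} - \hat{\mathbf{u}})$, the error is defined as $\boldsymbol\xi = \mathbf{u} - \hat{\mathbf{u}}$, the idealized case $\boldsymbol\theta = \hat{\boldsymbol\theta}$ is used, and $\mathbf{K} = k\mathbf{I}$ is chosen for simplicity.

```latex
% Assumed setup (hypothetical notation, idealized case \theta = \hat{\theta}):
%   coupled model:  \dot{u}       = U(t, u; \theta) - K (u - \hat{u})
%   data dynamics:  \dot{\hat{u}} = U(t, \hat{u}; \theta)
\begin{align}
\dot{\boldsymbol\xi}
  &= \mathbf{U}(t, \hat{\mathbf{u}} + \boldsymbol\xi; \boldsymbol\theta)
   - \mathbf{U}(t, \hat{\mathbf{u}}; \boldsymbol\theta)
   - \mathbf{K}\boldsymbol\xi \\
  &\approx \bigl(\mathbf{J}(t) - \mathbf{K}\bigr)\boldsymbol\xi,
  \qquad
  \mathbf{J}(t) = \left.\frac{\partial \mathbf{U}}{\partial \mathbf{u}}\right|_{\mathbf{u} = \hat{\mathbf{u}}(t)} .
\end{align}
% With K = k I and J_s = (J + J^T)/2 the symmetric part of J, the linearized error obeys:
\begin{align}
\frac{d}{dt}\,\|\boldsymbol\xi\|^2
  = 2\,\boldsymbol\xi^{\top}\bigl(\mathbf{J}(t) - k\mathbf{I}\bigr)\boldsymbol\xi
  \le 2\,\bigl(\lambda_{\max}(\mathbf{J}_s(t)) - k\bigr)\,\|\boldsymbol\xi\|^2 .
\end{align}
```

Under these assumptions, any $k$ larger than the supremum of $\lambda_{\max}(\mathbf{J}_s(t))$ along the data trajectory makes the linearized error $\|\boldsymbol\xi(t)\|$ decay exponentially, which is the sense in which a sufficiently strong coupling forces the model state onto the data.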