Homotopy-based training of NeuralODEs for accurate dynamics discovery
Joon-Hyuk Ko, Hankyul Koh, Nojun Park, Wonho Jhe
TL;DR
This work addresses the difficulty of training NeuralODEs on long time-series data by combining a synchronization-based coupling, which smooths the otherwise irregular loss landscape, with homotopy optimization, which exploits the smoothed landscape, all without constraining the model architecture. By coupling the model dynamics to the data through a tunable term and gradually reducing the homotopy parameter, the method guides optimization toward better minima and improves extrapolation. Empirical results on Lotka–Volterra, double pendulum, and Lorenz systems show faster convergence, robustness to noise, and competitive or superior predictive performance compared with vanilla training and multiple shooting. The approach offers a general, architecture-agnostic framework for dynamics discovery with potential applicability to high-dimensional and latent-ODE settings.
Abstract
Neural Ordinary Differential Equations (NeuralODEs) present an attractive way to extract dynamical laws from time series data, as they bridge neural networks with the differential equation-based modeling paradigm of the physical sciences. However, these models often display long training times and suboptimal results, especially for longer-duration data. While a common strategy in the literature imposes strong constraints on the NeuralODE architecture to inherently promote stable model dynamics, such methods are ill-suited for dynamics discovery, as the unknown governing equation is not guaranteed to satisfy the assumed constraints. In this paper, we develop a new training method for NeuralODEs, based on synchronization and homotopy optimization, that does not require changes to the model architecture. We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape, which homotopy optimization can then leverage to enhance training. Through benchmark experiments, we demonstrate that our method achieves competitive or better training loss while often requiring less than half the number of training epochs compared to other model-agnostic techniques. Furthermore, models trained with our method display better extrapolation capabilities, which further highlights its effectiveness.
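To make the idea of coupling the model to the data and then relaxing that coupling concrete, the following is a minimal sketch on a toy problem, not the implementation used in this work. It assumes a coupling term of the form -k (x - x_data), a one-parameter linear model dx/dt = theta x in place of a neural network, forward Euler integration, a hand-picked schedule for the coupling strength k, and a sign-based gradient step; the names coupled_rollout, mse, and homotopy_fit are hypothetical.

```python
def coupled_rollout(theta, k, data, dt):
    """Euler rollout of dx/dt = theta * x - k * (x - x_data(t)).

    The -k * (x - x_data) term is the assumed synchronization coupling: it
    pulls the model state toward the observations. k = 0 is the plain model.
    """
    x = data[0]
    traj = [x]
    for i in range(len(data) - 1):
        x = x + dt * (theta * x - k * (x - data[i]))
        traj.append(x)
    return traj


def mse(theta, k, data, dt):
    """Mean squared error between the (coupled) rollout and the data."""
    traj = coupled_rollout(theta, k, data, dt)
    return sum((p - d) ** 2 for p, d in zip(traj, data)) / len(data)


def homotopy_fit(data, dt, theta0, k_schedule, steps_per_stage=100,
                 step=0.02, eps=1e-6):
    """Minimize the coupled loss for each k in turn, warm-starting each stage
    from the previous one. The schedule should end at k = 0 so that the last
    stage optimizes the original, uncoupled objective."""
    theta = theta0
    history = []
    for k in k_schedule:
        for _ in range(steps_per_stage):
            # Central finite-difference estimate of d(loss)/d(theta).
            grad = (mse(theta + eps, k, data, dt)
                    - mse(theta - eps, k, data, dt)) / (2 * eps)
            # Fixed-size step against the gradient sign (illustrative choice).
            if grad > 0:
                theta -= step
            elif grad < 0:
                theta += step
        history.append((k, theta))
    return theta, history


# Synthetic, noise-free data from the same Euler scheme with theta = -0.5.
TRUE_THETA = -0.5
DT = 0.05
data = [1.0]
for _ in range(100):
    data.append(data[-1] + DT * (TRUE_THETA * data[-1]))

theta_hat, history = homotopy_fit(
    data, DT, theta0=1.0, k_schedule=[4.0, 2.0, 1.0, 0.5, 0.0]
)
print("estimated theta:", theta_hat)
print("uncoupled loss at theta0:", mse(1.0, 0.0, data, DT))
print("coupled loss at theta0 (k=4):", mse(1.0, 4.0, data, DT))
```

In this noise-free toy, the coupled rollout coincides with the data at the true parameter for every k, so the coupling leaves the minimizer in place; away from it, a large k keeps the rollout close to the data and the loss bounded, whereas the uncoupled rollout from a poor initial guess grows exponentially. The toy loss is simple enough that it does not reproduce the irregular landscape discussed above; it only illustrates the mechanics of the coupling and the schedule.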
