Jacobian Regularization Stabilizes Long-Term Integration of Neural Differential Equations
Maya Janvier, Julien Salomon, Etienne Meunier
TL;DR
This paper tackles the stability challenge of long-term integration with Neural Differential Equations by introducing Jacobian-based regularizations that align the learned dynamics with the true system through directional derivatives. It presents two cost-effective losses: an exact directional-derivative loss for known dynamics ($\mathcal{L}_{AD}$) and a finite-difference, unsupervised loss for unknown dynamics ($\mathcal{L}_{FD}$), both leveraging a Hutchinson trace estimator to avoid full Jacobian computations. The approach improves long-term stability on two ODE problems (Two-Body and Rigid Body) and one PDE (Kuramoto-Sivashinsky), with distinct strengths: $\mathcal{L}_{AD}$ excels when Jacobians are tractable, while $\mathcal{L}_{FD}$ offers robust performance with unknown dynamics and easier tuning. This work enables stable, long-range simulations using neural approximations of dynamical systems without resorting to expensive long training rollouts, thereby broadening applicability to large-scale physical models. Bounds involving $L_F$ and $L_{F_\theta}$ motivate regularizing $J_{F_\theta}$ toward $J_F$, linking Jacobian accuracy to trajectory stability.
Abstract
Hybrid models and Neural Differential Equations (NDE) are becoming increasingly important for modeling physical systems, but they often encounter stability and accuracy issues during long-term integration. Training on unrolled trajectories is known to limit these divergences, but it quickly becomes too expensive because gradients must be computed through an iterative process. In this paper, we demonstrate that regularizing the Jacobian of the NDE model via its directional derivatives during training stabilizes long-term integration in the challenging context of short training rollouts. We design two regularizations: one for the case of known dynamics, where we can directly derive the directional derivatives of the dynamics, and one for the case of unknown dynamics, where they are approximated using finite differences. Both methods cost far less than long training rollouts and improve the stability of long-term simulations for several ordinary and partial differential equations, opening the door to training NDE methods for long-term integration of large-scale systems.
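To make the idea of matching directional derivatives concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pendulum as a stand-in for the true dynamics, a hypothetical one-parameter model $F_\theta(q, p) = (p, -a \sin q)$, and random unit directions; all function names are illustrative. The model's directional derivative is written by hand here, whereas an automatic-differentiation framework would normally supply it. The finite-difference target further assumes that the true dynamics can be queried as a black box at perturbed states, which may differ from how the paper obtains its finite differences.

```python
import math
import random


def true_dynamics(x):
    """Pendulum vector field F(q, p) = (p, -sin q); an illustrative stand-in."""
    q, p = x
    return [p, -math.sin(q)]


def true_jvp(x, v):
    """Exact directional derivative J_F(x) v, derived by hand (known dynamics)."""
    q, _ = x
    return [v[1], -math.cos(q) * v[0]]


def fd_jvp(f, x, v, eps=1e-4):
    """Forward finite-difference estimate of J_f(x) v from two evaluations of f."""
    shifted = [xi + eps * vi for xi, vi in zip(x, v)]
    return [(fs - f0) / eps for fs, f0 in zip(f(shifted), f(x))]


def make_model_jvp(a):
    """Directional derivative of the hypothetical model F_theta = (p, -a sin q)."""
    def model_jvp(x, v):
        q, _ = x
        return [v[1], -a * math.cos(q) * v[0]]
    return model_jvp


def random_unit_direction(dim, rng):
    """Sample a unit direction by normalizing a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]


def directional_penalty(model_jvp, target_jvp, states, rng, n_dirs=8):
    """Mean squared mismatch between model and target directional derivatives.

    No full Jacobian is formed: each term only needs J v for one direction v.
    """
    total, count = 0.0, 0
    for x in states:
        for _ in range(n_dirs):
            v = random_unit_direction(len(x), rng)
            diff = [m - t for m, t in zip(model_jvp(x, v), target_jvp(x, v))]
            total += sum(d * d for d in diff)
            count += 1
    return total / count


def fd_target(x, v):
    """Finite-difference target, treating true_dynamics as a black box."""
    return fd_jvp(true_dynamics, x, v)


states = [[0.1, 0.0], [1.0, -0.5], [2.0, 0.3]]

for a in (1.0, 1.5):
    # Re-seeding gives both penalties the same random directions.
    pen_exact = directional_penalty(make_model_jvp(a), true_jvp, states, random.Random(0))
    pen_fd = directional_penalty(make_model_jvp(a), fd_target, states, random.Random(0))
    print(f"a={a}: exact-target penalty={pen_exact:.6f}, FD-target penalty={pen_fd:.6f}")
```

In this toy setting the penalty vanishes when the model's Jacobian matches the true one (a = 1) and grows as the two diverge, and the finite-difference target closely tracks the exact one. In training, such a term would be added to the usual short-rollout loss with a weighting coefficient, which is a further assumption of this sketch.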
