DynGMA: a robust approach for learning stochastic differential equations from data
Aiqing Zhu, Qianxiao Li
TL;DR
This paper tackles learning fully unknown drift $f$ and diffusion $\sigma$ in stochastic differential equations from discrete trajectories. It introduces DynGMA, a dynamical Gaussian mixture density that combines multi-step Gaussian approximations with cubature to enable likelihood-based training with moderate-to-large time steps and variable sampling intervals, while also supporting computation of the invariant distribution. The approach is backed by an error analysis based on an asymptotic Gaussian density expansion and uses neural-network parameterizations $f_\theta$ and $\sigma_\theta$, yielding accurate reconstructions across low- and high-dimensional SDEs, as well as on data from Gillespie stochastic simulations. The method demonstrates robustness to measurement noise, scalability to 10D systems, and improved estimation of invariant structures. Future work could integrate partial physics-informed terms and extend the approach to non-Gaussian noise.
Abstract
Learning unknown stochastic differential equations (SDEs) from observed data is a significant and challenging task with applications in various fields. Current approaches often use neural networks to represent the drift and diffusion functions, and train these networks with a likelihood-based loss constructed by approximating the transition density. However, these methods often rely on one-step stochastic numerical schemes, which require data with sufficiently high time resolution. In this paper, we introduce novel approximations to the transition density of the parameterized SDE: a Gaussian density approximation inspired by the random perturbation theory of dynamical systems, and its extension, the dynamical Gaussian mixture approximation (DynGMA). Benefiting from the robust density approximation, our method exhibits superior accuracy compared to baseline methods in learning the fully unknown drift and diffusion functions and in computing the invariant distribution from trajectory data. It can also handle trajectory data with low time resolution and variable, even uncontrollable, time step sizes, such as data generated from Gillespie's stochastic simulations. We conduct several experiments across various scenarios to verify the advantages and robustness of the proposed method.
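
To make the "one-step stochastic numerical scheme" baseline concrete, the following is a minimal sketch of a likelihood-based loss built from the Euler-Maruyama Gaussian transition density for a scalar SDE. It is not DynGMA itself; it illustrates the kind of one-step approximation whose accuracy degrades as the time step grows. The function names `em_transition_nll` and `trajectory_nll`, and the use of plain callables for the drift and diffusion, are assumptions made for illustration only.

```python
import math


def em_transition_nll(x0, x1, dt, drift, diffusion):
    """Negative log-density of x1 given x0 under the one-step
    Euler-Maruyama approximation for a scalar SDE:
        x1 ~ Normal(x0 + f(x0) * dt, sigma(x0)^2 * dt).
    `drift` and `diffusion` are hypothetical callables standing in
    for the parameterized f_theta and sigma_theta.
    """
    mean = x0 + drift(x0) * dt
    var = diffusion(x0) ** 2 * dt
    return 0.5 * (math.log(2.0 * math.pi * var) + (x1 - mean) ** 2 / var)


def trajectory_nll(ts, xs, drift, diffusion):
    """Sum the one-step negative log-likelihoods along a trajectory.
    Each transition uses its own time step, so irregular sampling
    times are allowed (the Gaussian approximation itself is still
    only accurate when each step is small).
    """
    total = 0.0
    for k in range(len(xs) - 1):
        dt = ts[k + 1] - ts[k]
        total += em_transition_nll(xs[k], xs[k + 1], dt, drift, diffusion)
    return total


# Illustrative use on an Ornstein-Uhlenbeck-type model (assumed example):
# drift f(x) = -x, constant diffusion sigma(x) = 1.
ou_drift = lambda x: -x
ou_diffusion = lambda x: 1.0
loss = trajectory_nll([0.0, 0.1, 0.3], [1.0, 0.9, 0.72], ou_drift, ou_diffusion)
```

In a learning setting, `drift` and `diffusion` would be neural networks and this sum would be minimized over their parameters; the approximations introduced in the paper replace the one-step Gaussian used here.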
