An evolutionary approach for discovering non-Gaussian stochastic dynamical systems based on nonlocal Kramers-Moyal formulas
Yang Li, Shengyuan Xu, Jinqiao Duan
TL;DR
This work develops an Evolutionary Symbolic Sparse Regression (ESSR) framework to recover non-Gaussian stochastic dynamics from data by uniting nonlocal Kramers-Moyal formulas, genetic programming, and elastic-net sparse regression. The method learns drift $b(x)$, diffusion $a(x)$ (with $a=\sigma_1\sigma_1^{\top}$), and Lévy jump kernel $W(y)$ (and intensity $\sigma_2$) directly from sample-path data for SDEs of the form $dx(t)=b(x(t))dt+\sigma_1(x(t))dB_t+\sigma_2 dL_t$, without heavy prior assumptions on functional forms. It builds three training datasets from short-time statistics, uses GP to generate candidate terms, and employs elastic-net to identify sparse, interpretable models; an adaptive fitness and hard-thresholding ridge regression promote parsimony. Numerical experiments on a 2D Maier-Stein system and a 3D chaotic system demonstrate accurate recovery of the Lévy kernel (e.g., $W(y)$ behaving like an $\alpha$-stable kernel with $\alpha=1.5$ or $0.5$) and the corresponding drift and diffusion terms, including complex functions such as trigonometric and rational forms. The approach offers interpretable, data-driven insights into non-Gaussian stochastic dynamics with broad applicability across physics, biology, and finance.
Abstract
Discovering explicit governing equations of stochastic dynamical systems with both (Gaussian) Brownian noise and (non-Gaussian) Lévy noise from data is challenging due to the possibly intricate functional forms and the inherent complexity of Lévy motion. The present research endeavors to develop an evolutionary symbol sparse regression (ESSR) approach to extract non-Gaussian stochastic dynamical systems from sample path data, based on nonlocal Kramers-Moyal formulas, genetic programming, and sparse regression. More specifically, genetic programming is employed to generate a diverse array of candidate functions, the sparse regression technique aims at learning the coefficients associated with these candidates, and the nonlocal Kramers-Moyal formulas serve as the foundation for constructing the fitness measure in genetic programming and the loss function in sparse regression. The efficacy and capabilities of this approach are showcased through its application to several illustrative models. This approach stands out as a potent instrument for deciphering non-Gaussian stochastic dynamics from available datasets, indicating a wide range of applications across different fields.
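
To make the regression step concrete, the following is a minimal Python sketch of drift recovery on a toy 1D model, not the authors' implementation. Several pieces are assumptions made for illustration: the drift $b(x)=x-x^3$, the parameter values, a compound-Poisson process standing in for Lévy motion, a fixed polynomial library in place of candidates generated by genetic programming, and a sequentially thresholded ridge regression in place of elastic-net. The drift target is taken to be the short-time mean of increments restricted to small displacements $|x-z|<\varepsilon$, which is how the sketch separates the jump contribution from the drift. The function names `simulate_pairs` and `stridge` are hypothetical.

```python
import numpy as np

def simulate_pairs(n, dt, rng, sigma1=0.5, jump_rate=1.0):
    """One Euler step from random starting points z (toy 1D model, assumed)."""
    z = rng.uniform(-2.0, 2.0, size=n)
    drift = z - z**3  # true drift b(z) that the regression should recover
    brown = sigma1 * np.sqrt(dt) * rng.standard_normal(n)
    # Compound-Poisson stand-in for Levy jumps: rare, large, symmetric.
    has_jump = rng.random(n) < jump_rate * dt
    size = (1.0 + rng.exponential(1.0, size=n)) * rng.choice([-1.0, 1.0], size=n)
    x = z + drift * dt + brown + has_jump * size
    return z, x

def stridge(theta, y, lam=1e-6, tol=0.2, max_iter=10):
    """Ridge regression with repeated hard thresholding of small coefficients."""
    active = np.ones(theta.shape[1], dtype=bool)
    coef = np.zeros(theta.shape[1])
    for _ in range(max_iter):
        A = theta[:, active]
        coef[:] = 0.0
        coef[active] = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
        small = active & (np.abs(coef) < tol)  # terms to prune this round
        coef[small] = 0.0
        active &= ~small
        if not small.any() or not active.any():
            break
    return coef

rng = np.random.default_rng(0)
dt, eps = 0.01, 0.5
z, x = simulate_pairs(200_000, dt, rng)
dx = x - z

# Truncated first moment: increments with |dx| >= eps are attributed to jumps
# and contribute zero to the drift target.
target = np.where(np.abs(dx) < eps, dx, 0.0) / dt

# Fixed candidate library {1, x, x^2, x^3}; in ESSR these candidates would
# instead be produced by genetic programming.
theta = np.column_stack([np.ones_like(z), z, z**2, z**3])
coef = stridge(theta, target)
print(dict(zip(["1", "x", "x^2", "x^3"], np.round(coef, 3))))
```

In this setup the retained coefficients should be close to $1$ for $x$ and $-1$ for $x^3$, with the constant and quadratic terms pruned. The truncation introduces a small bias of order `jump_rate * dt`, which vanishes as the time step shrinks.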
