An evolutionary approach for discovering non-Gaussian stochastic dynamical systems based on nonlocal Kramers-Moyal formulas

Yang Li, Shengyuan Xu, Jinqiao Duan

TL;DR

This work develops an Evolutionary Symbolic Sparse Regression (ESSR) framework to recover non-Gaussian stochastic dynamics from data by uniting nonlocal Kramers-Moyal formulas, genetic programming, and elastic-net sparse regression. The method learns drift $b(x)$, diffusion $a(x)$, and Lévy jump kernel $W(y)$ (and intensity $\sigma_2$) directly from sample-path data for SDEs of the form $dx(t)=b(x(t))dt+\sigma_1(x(t))dB_t+\sigma_2 dL_t$, without heavy prior assumptions on functional forms. It builds three training datasets from short-time statistics, uses GP to generate candidate terms, and employs elastic-net to identify sparse, interpretable models; an adaptive fitness and hard-thresholding ridge regression promote parsimony. Numerical experiments on a 2D Maier-Stein system and a 3D chaotic system demonstrate accurate recovery of the Lévy kernel (e.g., $W(y)$ behaving like an $\alpha$-stable kernel with $\alpha=1.5$ or $0.5$) and the corresponding drift and diffusion terms, including complex functions such as trigonometric and rational forms. The approach offers interpretable, data-driven insights into non-Gaussian stochastic dynamics with broad applicability across physics, biology, and finance.
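
To make the model class concrete, here is a minimal sketch (not the authors' code) of how sample-path data for a one-dimensional SDE of the form above could be generated with an Euler-type scheme. It assumes $L_t$ is a standard symmetric $\alpha$-stable Lévy motion, sampled with the Chambers-Mallows-Stuck method. The function names, the drift $b(x)=-x$, the constant $\sigma_1$, and all parameter values are illustrative assumptions.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """Draw one standard symmetric alpha-stable variate (Chambers-Mallows-Stuck)."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)  # Cauchy case
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))


def simulate_path(b, sigma1, sigma2, alpha, x0, dt, n_steps, seed=0):
    """Euler-type scheme for dx = b(x) dt + sigma1(x) dB_t + sigma2 dL_t."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        dB = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        # Self-similarity: an alpha-stable increment over dt scales as dt**(1/alpha).
        dL = dt ** (1 / alpha) * symmetric_stable(alpha, rng)
        xs.append(x + b(x) * dt + sigma1(x) * dB + sigma2 * dL)
    return xs


# Hypothetical example: linear drift, unit diffusion, alpha = 1.5.
path = simulate_path(b=lambda x: -x, sigma1=lambda x: 1.0, sigma2=0.5,
                     alpha=1.5, x0=0.0, dt=1e-3, n_steps=1000)
```

The occasional large jumps in such a path are what separate it from purely Brownian data, and they are the part of the signal from which the jump kernel $W(y)$ is learned.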

Abstract

Discovering explicit governing equations of stochastic dynamical systems with both (Gaussian) Brownian noise and (non-Gaussian) Lévy noise from data is challenging because of the possibly intricate functional forms and the inherent complexity of Lévy motion. The present research develops an evolutionary symbolic sparse regression (ESSR) approach to extract non-Gaussian stochastic dynamical systems from sample path data, based on nonlocal Kramers-Moyal formulas, genetic programming, and sparse regression. More specifically, genetic programming is employed to generate a diverse array of candidate functions, sparse regression learns the coefficients associated with these candidates, and the nonlocal Kramers-Moyal formulas serve as the foundation for constructing the fitness measure in genetic programming and the loss function in sparse regression. The efficacy and capabilities of this approach are showcased through its application to several illustrative models. The approach is a potent instrument for deciphering non-Gaussian stochastic dynamics from available datasets, with a wide range of potential applications across different fields.
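
The sparse-regression step can be illustrated with a minimal sketch, which simplifies heavily and is not the authors' implementation. It assumes a fixed, hand-picked candidate library in place of the one produced by genetic programming, Brownian noise only (so the target is the plain short-time increment ratio rather than the nonlocal Kramers-Moyal statistics), and a hard-thresholded ridge regression. The function name `thresholded_ridge`, the test drift $b(x)=4x-x^3$, and all parameter values are hypothetical.

```python
import numpy as np


def thresholded_ridge(Theta, y, lam=1e-3, tol=0.2, n_iter=10):
    """Ridge regression with iterative hard thresholding of small coefficients."""
    n_terms = Theta.shape[1]
    active = np.ones(n_terms, dtype=bool)
    coef = np.zeros(n_terms)
    for _ in range(n_iter):
        A = Theta[:, active]
        # Ridge solution restricted to the currently active candidates.
        sol = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
        coef = np.zeros(n_terms)
        coef[active] = sol
        new_active = np.abs(coef) >= tol
        if not new_active.any() or np.array_equal(new_active, active):
            active = new_active
            break
        active = new_active
    coef[~active] = 0.0
    return coef


# Synthetic one-step increments for an assumed drift b(x) = 4x - x^3.
rng = np.random.default_rng(0)
dt, sigma, n = 0.01, 0.1, 20000
x = rng.uniform(-2.0, 2.0, size=n)
dx = (4 * x - x**3) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
y = dx / dt  # short-time estimate of the drift at each sample point

# Candidate library: columns correspond to 1, x, x^2, x^3.
Theta = np.column_stack([np.ones(n), x, x**2, x**3])
coef = thresholded_ridge(Theta, y)
```

Thresholding removes candidates whose coefficients stay small, so the surviving terms form a sparse, readable expression for the drift. In the ESSR setting the library itself would evolve from one generation to the next instead of being fixed.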

Paper Structure

This paper contains 16 sections, 57 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Schematic diagram of the basic structure of the ESSR method.
  • Figure 2: Framework of genetic programming. The gold ellipse denotes the population of genetic programming and $G$ indicates the generation. Each blue box represents an individual within the population, which consists of several candidate functions indicated by purple circles. The candidate functions are mathematical expressions composed of the operators in the function set and the variables in the terminal set.
  • Figure 3: Genetic operators of genetic programming. (a) Crossover. Two parental candidate functions, $\sin x_1 + x_2 x_3$ and $1/(x_1+c)$, produce two offspring trees, $\sin x_1 + 1$ and $x_2 x_3/(x_1+c)$, after the crossover operation. (b) Subtree mutation. One parental expression $1/(x_1+c_1)$ produces an offspring tree $\ln x_1 /(x_1+c_1)$ after subtree mutation. (c) Constant mutation. One parental expression $1/(x_1+c_1)$ produces an offspring tree $1 /(x_1+c_2)$ after constant mutation.
  • Figure 4: The learned results for the Lévy jump measure of the stochastic Maier-Stein system. (a) The optimal individual for the jump measure after running the algorithm. The constant $c$ in the tree structure is 2.0368. (b) The mean squared loss for the current best individual and for the best individual so far, together with the fitness measure, over the iterations. (c) The number of candidates and the total number of nodes in the current best individual over the iterations.
  • Figure 5: The learned results for the drift coefficient of the stochastic Maier-Stein system. (a) The optimal individual for the drift coefficient after running the algorithm. (b) The mean squared loss for the current best individual and for the best individual so far, together with the fitness measure, over the iterations. (c) The number of candidates and the total number of nodes in the current best individual over the iterations.
  • ...and 4 more figures
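
The crossover operation described in the caption of Figure 3(a) can be sketched as follows. This is an illustrative sketch and not the authors' code. Expression trees are represented as nested tuples `(operator, child, ...)`, and the crossover points are fixed by hand to reproduce the example in the caption, whereas a genetic-programming run would normally choose them at random. All function names are hypothetical.

```python
def get_subtree(tree, path):
    """Follow a sequence of child indices down from the root."""
    for i in path:
        tree = tree[i + 1]  # tree[0] is the operator, children start at index 1
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0] + 1
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(p1, path1, p2, path2):
    """Swap the subtree of p1 at path1 with the subtree of p2 at path2."""
    s1, s2 = get_subtree(p1, path1), get_subtree(p2, path2)
    return replace_subtree(p1, path1, s2), replace_subtree(p2, path2, s1)


def to_string(tree):
    """Render a tree as a fully parenthesized infix string."""
    if not isinstance(tree, tuple):
        return str(tree)
    op, *args = tree
    if len(args) == 1:
        return f"{op}({to_string(args[0])})"
    return "(" + f" {op} ".join(to_string(a) for a in args) + ")"


# Parents from Figure 3(a): sin(x1) + x2*x3 and 1/(x1 + c).
parent1 = ("+", ("sin", "x1"), ("*", "x2", "x3"))
parent2 = ("/", 1, ("+", "x1", "c"))

# Swap the second child of parent1 (x2*x3) with the first child of parent2 (1).
child1, child2 = crossover(parent1, (1,), parent2, (0,))
print(to_string(child1))  # (sin(x1) + 1)
print(to_string(child2))  # ((x2 * x3) / (x1 + c))
```

Immutable tuples keep the parents unchanged, which is convenient when the same individual takes part in several crossover operations within one generation.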