Stoch-IDENT: New Method and Mathematical Analysis for Identifying SPDEs from Data

Jianbo Cui, Roy Y. He

Abstract

In this paper, we propose Stoch-IDENT, a novel framework for identifying stochastic partial differential equations (SPDEs) from observational data. Our method can handle linear and nonlinear high-order SPDEs driven by time-dependent Wiener processes, accommodating both additive and multiplicative noise structures. To investigate the identifiability of SPDEs from trajectory data, we analyze the spectral properties of the solution's mean and covariance for linear SPDEs with constant coefficients, as well as the dimension of the solution space for parabolic and hyperbolic types, generalizing the identifiability theory for deterministic PDEs. Algorithmically, the drift term is identified via a sample-mean generalization of existing methods for PDE identification. For the diffusion term, we formulate a sparse regression problem with quadratic measurements induced from drift residuals and feature covariances. To address this challenging non-convex and non-smooth optimization, we develop a new greedy algorithm, Quadratic Subspace Pursuit (QSP), and prove that QSP enjoys stable support recovery under certain conditions. We validate Stoch-IDENT on various SPDEs, demonstrating its effectiveness through quantitative and qualitative evaluations.
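
The two-stage idea in the abstract (drift from sample means, then diffusion from squared drift residuals) can be illustrated on a much simpler problem. The sketch below is an assumption-laden toy, not the Stoch-IDENT algorithm: it uses a scalar SDE $dX = aX\,dt + b\,dW$ instead of an SPDE, a single known feature instead of a sparse dictionary, and plain least squares instead of QSP. The function names `simulate_paths` and `identify_toy` are hypothetical.

```python
import numpy as np

def simulate_paths(a, b, x0, dt, n_steps, n_paths, rng):
    """Euler-Maruyama paths of the toy SDE dX = a*X dt + b dW."""
    x = np.empty((n_paths, n_steps + 1))
    x[:, 0] = x0
    for n in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x[:, n + 1] = x[:, n] + a * x[:, n] * dt + b * dw
    return x

def identify_toy(x, dt):
    """Estimate (a, b) from trajectories x of shape (n_paths, n_steps + 1)."""
    dx = np.diff(x, axis=1)           # increments along time
    mean_x = x[:, :-1].mean(axis=0)   # sample mean of the drift feature X
    mean_dx = dx.mean(axis=0)         # sample mean of the increments
    # Drift step: least squares fit of mean_dx ~ a * mean_x * dt.
    a_hat = (mean_x @ mean_dx) / ((mean_x @ mean_x) * dt)
    # Diffusion step: squared drift residuals satisfy E[r^2] ~ b^2 * dt.
    resid = dx - a_hat * x[:, :-1] * dt
    b_hat = np.sqrt(np.mean(resid ** 2) / dt)
    return a_hat, b_hat

rng = np.random.default_rng(0)
dt = 1e-3
x = simulate_paths(a=-1.0, b=0.5, x0=2.0, dt=dt, n_steps=1000, n_paths=2000, rng=rng)
a_hat, b_hat = identify_toy(x, dt)
print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
```

Averaging over trajectories suppresses the noise in the drift fit, and the diffusion coefficient enters only through its square, which is why the second stage is a regression with quadratic measurements. In the paper's setting both stages additionally select a sparse subset of candidate features.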

Paper Structure

This paper contains 38 sections, 9 theorems, 143 equations, 4 figures, 4 tables, 2 algorithms.

Key Result

Proposition 2.1

Suppose $\mathbf a^*=(a^*_1,\cdots, a^*_K)$ and $\mathbf b^*=(b^*_1,\cdots, b^*_J)$ are the true drift and diffusion coefficient vectors of the underlying SPDE, respectively. Assume that $\sup_{(t,x)\in [0,T]\times \mathcal{D}} \mathbb E [|F_k(t,x)|^2]<+\infty$ and …

Figures (4)

  • Figure 1: Workflow of the proposed Stoch-IDENT for identifying SPDEs from trajectory data.
  • Figure 2: Identification performance versus the number of trajectories for (a) the stochastic transport equation \ref{eq_example_transport}, (b) the stochastic KdV equation \ref{eq_example_kdv}, and (c) the stochastic Burgers equation \ref{eq_example_burgers}. Performance is evaluated using precision, recall, and relative in-sample and out-of-sample coefficient errors. For each number of trajectories, results are averaged over 100 independent experiments, with shaded regions indicating one standard deviation.
  • Figure 3: Numerical verification of the identifiability theory developed in Section \ref{sec_identifiability}. According to Theorem \ref{theorem-diffusion-identification} and Proposition \ref{hyper-dimension}, the solution dimension of the stochastic heat equation is generally lower than that of the stochastic transport equation. Using \ref{eq:ident-transport} and \ref{eq:ident-heat} as test models, (a) shows the F1 scores for drift and diffusion identification as the number of trajectories increases, and (b) shows the spectrum of the averaged trajectory. Drift recovery is much easier for the stochastic transport equation than for the stochastic heat equation, which is consistent with the richer spectral content of the transport solution. Diffusion identification, on the other hand, is considerably harder for the transport equation than for the heat equation, because the transport equation accumulates noise during evolution.
  • Figure 4: Comparison between a solution path of the stochastic Allen-Cahn equation \ref{eq_allen_cahn} at (a) $t=8\times 10^{-3}$, (b) $t=2\times 10^{-2}$, and (c) $t=8\times 10^{-2}$ and the simulation of the identified SPDE \ref{eq_allen_cahn_simulation} at (d)-(f) the same time points. The identification is based on $20$ trajectories sampled on a coarse grid, and the simulated dynamics exhibit pattern formation similar to that of the true model.

Theorems & Definitions (25)

  • Proposition 2.1
  • Proposition 3.1
  • Theorem 3.2
  • proof
  • Theorem 3.3
  • proof
  • Proposition 3.4
  • proof
  • Theorem 4.1
  • proof
  • ...and 15 more