An efficient Wasserstein-distance approach for reconstructing jump-diffusion processes using parameterized neural networks

Mingtao Xia; Xiangting Li; Qijing Shen; Tom Chou

TL;DR

This work tackles the inverse problem of reconstructing multidimensional jump-diffusion processes from data using Wasserstein-distance-based losses. It introduces a temporally decoupled squared $W_2$ distance, $\tilde{W}_2^2(\mu, \hat{\mu}) = \int_0^T W_2^2(\mu(t), \hat{\mu}(t))\, dt$, which is efficiently computable from finite-sample trajectories and provides both upper and lower bounds on the reconstruction errors of the drift $\bm{f}$, diffusion $\bm{\sigma}$, and jump $\bm{\beta}$ functions when the true process is approximated by a neural-network-parameterized model. Theoretical results establish that $W_p(\mu,\hat{\mu})$ lower-bounds the aggregate coefficient discrepancies, while the temporally decoupled form can be estimated in practice from finite samples, with well-defined finite-time projections ensuring convergence properties. Numerical experiments demonstrate that minimizing the temporally decoupled squared $W_2$ loss yields more accurate reconstructions than standard losses (e.g., MSE, MMD, $W_1$, $W_2$, WGAN), and that incorporating prior information on the drift can substantially improve recovery of the diffusion and jump functions. The approach holds promise for efficient, accurate inference of complex stochastic systems in finance, biology, and beyond, and points to future work on higher dimensions and more general noise types.
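
As a rough illustration of how $\tilde{W}_2^2$ might be estimated from finite-sample trajectories, the following Python sketch treats the one-dimensional case, where the squared $W_2$ distance between two equal-size empirical measures reduces to matching sorted samples. The function names, the uniform time grid, and the trapezoidal quadrature are assumptions made for this example and are not taken from the paper; multidimensional data would require a general optimal-transport solver.

```python
from typing import List, Sequence


def w2_squared_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared W2 distance between two equal-size 1D empirical measures.

    In one dimension the optimal coupling pairs the sorted samples,
    so no optimization is needed.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((a - b) ** 2 for a, b in pairs) / len(xs)


def decoupled_w2_squared(
    traj: List[List[float]], traj_hat: List[List[float]], dt: float
) -> float:
    """Approximate the time integral of W2^2 between time marginals.

    traj[k][i] is the value of trajectory i at time k * dt (assumed layout).
    The integral over [0, T] is approximated with the trapezoidal rule
    (an assumption of this sketch).
    """
    if len(traj) != len(traj_hat) or len(traj) < 2:
        raise ValueError("need matching time grids with at least two points")
    per_time = [w2_squared_1d(a, b) for a, b in zip(traj, traj_hat)]
    return sum(0.5 * (u + v) * dt for u, v in zip(per_time, per_time[1:]))


if __name__ == "__main__":
    # Two samples per time slice; the model is shifted by +1 at every time.
    observed = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
    simulated = [[1.0, 2.0], [2.0, 1.0], [1.0, 2.0]]
    print(decoupled_w2_squared(observed, simulated, dt=0.5))
```

Because each time slice is compared separately, the cost grows linearly with the number of time points, and the order of trajectories within a slice does not matter.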

Abstract

We analyze the Wasserstein distance ($W$-distance) between two probability distributions associated with two multidimensional jump-diffusion processes. Specifically, we analyze a temporally decoupled squared $W_2$-distance, which provides both upper and lower bounds associated with the discrepancies in the drift, diffusion, and jump amplitude functions between the two jump-diffusion processes. Then, we propose a temporally decoupled squared $W_2$-distance method for efficiently reconstructing unknown jump-diffusion processes from data using parameterized neural networks. We further show its performance can be enhanced by utilizing prior information on the drift function of the jump-diffusion process. The effectiveness of our proposed reconstruction method is demonstrated across several examples and applications.
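
To make the three functions being reconstructed more concrete, here is a minimal Euler-Maruyama sketch of a scalar jump-diffusion path in Python. The paper's model equation is not reproduced in this summary, so the form $dX = f\,dt + \sigma\,dB + \beta\,(dN - \lambda\,dt)$ with a constant-rate Poisson process $N$, as well as every name and default below, is an assumption made for illustration only. In a reconstruction setting, `f`, `sigma`, and `beta` would be replaced by parameterized neural networks, or `f` would be supplied as prior information.

```python
import math
import random
from typing import Callable, List

Coef = Callable[[float, float], float]  # maps (x, t) to a scalar coefficient


def poisson_sample(rng: random.Random, mean: float) -> int:
    """Draw a Poisson variate with Knuth's multiplication method.

    Adequate for the small means rate * dt used in this sketch.
    """
    threshold = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def simulate_path(
    f: Coef, sigma: Coef, beta: Coef,
    x0: float, dt: float, n_steps: int,
    rate: float = 1.0, seed: int = 0,
) -> List[float]:
    """Euler-Maruyama path of dX = f dt + sigma dB + beta (dN - rate dt)."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for k in range(n_steps):
        t = k * dt
        d_b = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment
        d_n = poisson_sample(rng, rate * dt)  # number of jumps in this step
        x = x + f(x, t) * dt + sigma(x, t) * d_b + beta(x, t) * (d_n - rate * dt)
        path.append(x)
    return path


if __name__ == "__main__":
    # Square-root diffusion and jump amplitudes as in the Figure 2 caption;
    # the mean-reverting drift is a hypothetical stand-in.
    path = simulate_path(
        f=lambda x, t: 1.0 - x,
        sigma=lambda x, t: 0.1 * math.sqrt(abs(x)),
        beta=lambda x, t: 0.1 * math.sqrt(abs(x)),
        x0=2.0, dt=0.01, n_steps=1000,
    )
    print(len(path), path[-1])
```

Many such paths, generated from the candidate model and from the data-generating process, are what a trajectory-based loss would compare.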

Paper Structure (17 sections, 6 theorems, 116 equations, 6 figures, 4 tables)

Introduction
Contribution
Organization
The $W$-distance between the probability measures associated with the jump-diffusion processes in Eqs. \ref{model_equation} and \ref{approximate_equation}
A temporally decoupled squared $W_2$ distance
Numerical experiments
Summary & conclusions
Proof of Theorem \ref{theorem1}
Proof of Theorem \ref{theorem3}
Proof of Theorem \ref{theorem4}
Proof of Theorem \ref{theorem5}
Default training settings
Definitions of different loss metrics
Varying the coefficients that determine diffusion and jump functions
Reconstructing Eq. \ref{example2_model} in Example \ref{example2} with different numbers of trajectories in the training set
...and 2 more sections

Key Result

Theorem 2.1

Suppose $\bm{X}(t)$ and $\hat{\bm{X}}(t)$ are two $d$-dimensional jump-diffusion processes determined by Eq. \ref{model_equation} and Eq. \ref{approximate_equation}, respectively. We denote [...] and assume that [...] are martingales for all $i, j$. Then, the following inequality holds: [...] where $|\cdot|_2$ denotes the $2$-norm of a $d$-dimensional vector, $\bm{X}(0)$ is the initial condition, and $H(t)$ is defined as [...]

Figures (6)

Figure 1: Reconstruction of trajectories and model functions. We define ground truth as $b=4, a=-1, \sigma_0=0.4, y_0=1$ in Eq. \ref{example1_numerical}, with $T=20.2$ and initial condition $X_0=2$. (a-f) Ground truth (black) and reconstructed trajectories (red) generated from the learned jump-diffusion process by minimizing different loss functions or using different methods. (g) The reconstruction errors of the drift, diffusion, and jump functions defined in Eqs. \ref{drift_error}, \ref{diffusion_error}, and \ref{jump_error}. We compare errors from minimizing our temporally decoupled squared $W_2$-distance versus those from minimizing the MSE, MMD, mean$^2$+var, the $W_1$-distance $W_1(\mu, \hat{\mu})$, the squared $W_2$-distance $W_2^2(\mu_N, \hat{\mu}_N)$, and the error of results obtained using the WGAN method. The mean and standard deviation of the error for different methods are obtained by repeating the experiment 10 times. (h) The reconstruction errors in the drift, diffusion, and jump functions defined in Eqs. \ref{drift_error}, \ref{diffusion_error}, and \ref{jump_error} w.r.t. the standard deviation $\delta$ of the initial condition (Eq. \ref{IC_noise}).
Figure 2: (a) The trajectories generated by the ground truth (black) jump-diffusion process with $\sigma(X, t)\equiv 0.1\sqrt{|X|}$ and $\beta(X, t)\equiv 0.1\sqrt{|X|}$ and the drift function given in Eq. \ref{example2_model}, plotted against reconstructed trajectories (red) obtained using the same drift function as a prior. (b-c) The ground truth diffusion and jump functions ($\sigma(X,t)\equiv\sigma_0\sqrt{|X|}$ and $\beta(X, t)\equiv\beta_0\sqrt{|X|}$) shown against the reconstructed functions $\hat{\sigma}(X, t)$ and $\hat{\beta}(X, t)$ (with the drift function given as a prior). The red curves are the mean $\hat{\sigma}(X, t)$ and $\hat{\beta}(X, t)$, while the shaded bands show their standard deviations, calculated over 5 independent experiments. (d-k) The reconstruction errors of the drift, diffusion, and jump functions without prior information on Eq. \ref{example2_model} or with one of the drift, diffusion, and jump functions given. When the drift function is given, errors in the reconstructed diffusion and jump functions are the smallest in all cases (error bars under "drift prior").
Figure 3: (a-b) Solutions generated by the reconstructed jump-diffusion process using our temporally decoupled squared $W_2$ method versus solutions generated by the ground truth Eq. \ref{example3_model}. (c) The reconstructed $\hat{\bm{X}}(t=10)$ versus the ground truth $\bm{X}(t=10)$. In (a-c), $c_1=c_2=-0.5$. (d) The error (Eq. \ref{sigma_error}) between the ground truth diffusion function $\bm{\sigma}$ and the reconstructed diffusion function $\hat{\bm{\sigma}}$. (e) The error (Eq. \ref{beta_error}) between the ground truth jump function $\bm{\beta}$ and the reconstructed jump function $\hat{\bm{\beta}}$. In (d-e), the errors are averaged over 5 independent experiments.
Figure 4: (a) The temporally decoupled squared Wasserstein distance $\tilde{W}_2^2(\mu_N, \hat{\mu}_N)$. (b) The average relative errors in the reconstructed drift function $\hat{f}(x)$. (c) The average relative errors in the reconstructed diffusion function $\hat{\sigma}(x)$. (d) The average relative errors in the reconstructed jump function $\hat{\beta}(x)$.
Figure 5: The reconstruction errors in the drift, diffusion, and jump functions defined in Eqs. \ref{diffusion_error} and \ref{jump_error} as a function of the number of trajectories $M_s$ when different prior information is provided. The results are averaged over 5 independent experiments. Training hyperparameters are the same as those used in Example \ref{example2} listed in Table \ref{tab:setting}.
...and 1 more figure

Theorems & Definitions (12)

Definition 2.1
Theorem 2.1
Corollary 2.1
proof
Theorem 2.2
Theorem 3.1
Theorem 3.2
Theorem 3.3
proof
Example 4.1
...and 2 more
