Convergence proofs and strong error bounds for forward-backward stochastic differential equations using neural network simulations

Oliver Sheridan-Methven

TL;DR

The paper tackles the challenge of solving high-dimensional forward-backward stochastic differential equations (FBSDEs) by combining neural network approximations with multilevel Monte Carlo (MLMC) techniques, leveraging the Feynman–Kac link to PDEs. It provides rigorous strong error bounds for both decoupled and coupled forward-backward systems when neural networks approximate the PDE solution. It also analyzes the bias and variance of the Raissi loss function, proposing a variance-reducing variant that scales with $\Delta t^{1/2}$ when Hessians are affordable. An MLMC-inspired training framework is developed to couple neural networks and discretisation levels, including offline/online training regimes and nested-decomposition ideas, with numerical results validating variance structure and regime transitions. The work identifies practical directions for improving loss formulations, variance reduction, and interpolation point choices, aiming to make NN/MLMC approaches for high-dimensional FBSDEs more robust and scalable with provable guarantees.

Abstract

We introduce forward-backward stochastic differential equations, highlighting the connection between solutions of these and solutions of partial differential equations, related by the Feynman-Kac theorem. We review the technique of approximating solutions to high dimensional partial differential equations using neural networks, and similarly approximating solutions of stochastic differential equations using multilevel Monte Carlo. Connecting the multilevel Monte Carlo method with the neural network framework using the setup established by E et al. and Raissi, we provide novel numerical analyses to produce strong error bounds for the specific framework of Raissi. Our results bound the overall strong error in terms of the maximum of the discretisation error and the neural network's approximation error. Our analyses are necessary for applications of multilevel Monte Carlo, for which we propose suitable frameworks to exploit the variance structures of the multilevel estimators we elucidate. Also, focusing on the loss function advocated by Raissi, we expose the limitations of this, highlighting and quantifying its bias and variance. Lastly, we propose various avenues of further research which we anticipate should offer significant insight and speed improvements.

Paper Structure

This paper contains 26 sections, 9 theorems, 61 equations, 1 figure, 2 tables, 2 algorithms.

Key Result

Theorem 2.1

[gobet2016monte, peng1991probabilistic, pardoux1992backward, wu2014probabilistic, cohen2015stochastic] Let $L$ be the differential operator … where … and where $a$ and $b$ are the drift and diffusion functions appearing in eqt:fbsde_fsde. Let $u \colon [0, T] \times \mathbb{R}^d \to \mathbb{R}$ be the solution to the semi-linear partial differential equation … with terminal boundary condition $u(T, x) = g(x)$

Figures (1)

  • Figure 1: The strong error using the $L^1$-norm from \ref{eqt:strong_error_lp} for various two-way and four-way differences of significance to multilevel Monte Carlo settings. Reference lines for a strong convergence order of $\tfrac{1}{2}$ are shown. Terms using a trained model parametrised by some $\theta$ have the number of iterations used to train the model shown in parentheses (e.g. annotating the $\blacktriangle$ markers). Similarly, terms using the same model at two different stages in training with differing numbers of iterations performed are denoted by $\theta$ and $\theta'$, and both the respective iterations are parenthesised (e.g. annotating the $\Diamond$ markers). For some example models, we indicate regions where the relevant error is dominated either by the discretisation error, or the approximation error for an imperfectly trained model. A legend marker which is parenthesised indicates that the results map directly onto those of the unparenthesised marker, but are omitted for clarity.
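
The order-$\tfrac{1}{2}$ reference lines in the caption can be illustrated with a toy two-way difference. The sketch below is not the paper's experiment: it involves no neural network and no backward equation. As an assumption chosen only for illustration, it simulates geometric Brownian motion with the Euler-Maruyama scheme and estimates the $L^1$ strong error between a fine path and a coarse path driven by the same Brownian increments. All names and parameter values (`strong_error_two_level`, `mu`, `sigma`, and so on) are hypothetical.

```python
import numpy as np

def strong_error_two_level(n_fine, n_paths=20000, T=1.0,
                           mu=0.05, sigma=0.2, x0=1.0, seed=0):
    """Estimate E|X_fine(T) - X_coarse(T)| for an assumed GBM test problem.

    The fine path uses n_fine Euler-Maruyama steps; the coarse path uses
    n_fine / 2 steps and is driven by the summed fine Brownian increments,
    which is the usual multilevel Monte Carlo coupling.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_fine
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_fine))
    x_fine = np.full(n_paths, x0)
    x_coarse = np.full(n_paths, x0)
    for k in range(0, n_fine, 2):
        # Two fine steps of size dt.
        x_fine = x_fine + mu * x_fine * dt + sigma * x_fine * dW[:, k]
        x_fine = x_fine + mu * x_fine * dt + sigma * x_fine * dW[:, k + 1]
        # One coarse step of size 2 * dt with the summed increment.
        dW_coarse = dW[:, k] + dW[:, k + 1]
        x_coarse = (x_coarse + mu * x_coarse * 2 * dt
                    + sigma * x_coarse * dW_coarse)
    return np.mean(np.abs(x_fine - x_coarse))

# Fine-level step counts 4, 8, ..., 128 (assumed values).
levels = [2 ** l for l in range(2, 8)]
errors = [strong_error_two_level(n) for n in levels]

# Fit log(error) against log(dt); the slope estimates the strong order.
slope = np.polyfit(np.log(1.0 / np.array(levels)), np.log(errors), 1)[0]
print(f"estimated strong convergence order: {slope:.2f}")
```

For this toy problem the fitted slope should sit near $\tfrac{1}{2}$, the rate drawn as a reference in the figure. The approximation-error-dominated regimes described in the caption have no counterpart here, since they arise only when an imperfectly trained model enters the difference.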

Theorems & Definitions (25)

  • Remark 2.1
  • Theorem 2.1: the multi-dimensional semi-linear Feynman-Kac theorem
  • Remark 2.2
  • Remark 2.3
  • Remark 2.4
  • Definition 3.1
  • Definition 3.2
  • Definition 3.3
  • Definition 3.4
  • Theorem 3.1
  • ...and 15 more