Convergence proofs and strong error bounds for forward-backward stochastic differential equations using neural network simulations
Oliver Sheridan-Methven
TL;DR
The paper tackles the challenge of solving high-dimensional forward-backward stochastic differential equations (FBSDEs) by combining neural network approximations with multilevel Monte Carlo (MLMC) techniques, leveraging the Feynman–Kac link to PDEs. It provides rigorous strong error bounds for both decoupled and coupled forward-backward systems when neural networks approximate the PDE solution. It also analyzes the bias and variance of the Raissi loss function, proposing a variance-reducing variant that scales with $\Delta t^{1/2}$ when Hessians are affordable. An MLMC-inspired training framework is developed to couple neural networks and discretisation levels, including offline/online training regimes and nested-decomposition ideas, with numerical results validating the variance structure and regime transitions. The work identifies practical directions for improving loss formulations, variance reduction, and interpolation point choices, aiming to make NN/MLMC approaches for high-dimensional FBSDEs more robust and scalable with provable guarantees.
Abstract
We introduce forward-backward stochastic differential equations, highlighting the connection between their solutions and those of partial differential equations given by the Feynman–Kac theorem. We review the technique of approximating solutions to high-dimensional partial differential equations using neural networks, and similarly approximating solutions of stochastic differential equations using multilevel Monte Carlo. Connecting the multilevel Monte Carlo method with the neural network framework using the setup established by E et al. and Raissi, we provide novel numerical analyses to produce strong error bounds for the specific framework of Raissi. Our results bound the overall strong error in terms of the maximum of the discretisation error and the neural network's approximation error. These analyses are necessary for applications of multilevel Monte Carlo; we elucidate the variance structures of the multilevel estimators and propose suitable frameworks to exploit them. Focusing on the loss function advocated by Raissi, we also expose its limitations, highlighting and quantifying its bias and variance. Lastly, we propose various avenues of further research that we anticipate will offer significant insight and speed improvements.
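
For concreteness, the Feynman–Kac connection referred to above can be sketched for a decoupled system as follows. The symbols $\mu$, $\sigma$, $\varphi$, $g$, $\xi$ and $u$, and the sign convention for the driver, are assumptions chosen to resemble the form commonly used in Raissi's setup; they are not taken from this abstract, and other references write the driver with the opposite sign.

```latex
% Decoupled forward-backward system (assumed notation):
\begin{aligned}
dX_t &= \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, & X_0 &= \xi, \\
dY_t &= \varphi(t, X_t, Y_t, Z_t)\,dt + Z_t^{\top}\sigma(t, X_t)\,dW_t, & Y_T &= g(X_T).
\end{aligned}

% Associated terminal-value PDE for u(t, x):
\partial_t u + \mu^{\top}\nabla u
  + \tfrac{1}{2}\operatorname{Tr}\!\left[\sigma\sigma^{\top}\nabla^2 u\right]
  = \varphi(t, x, u, \nabla u),
\qquad u(T, x) = g(x).

% Applying Ito's formula to u(t, X_t) and matching the dt and dW_t terms gives
Y_t = u(t, X_t), \qquad Z_t = \nabla u(t, X_t).
```

Under this convention, an approximation of $u$ supplies $Y$ directly and $Z$ through its gradient along simulated paths of $X$. In the coupled case the coefficients $\mu$ and $\sigma$ may additionally depend on $Y_t$ (and possibly $Z_t$), so the forward and backward equations can no longer be simulated separately.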
