A backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations
Lorenc Kapllani, Long Teng
TL;DR
The paper tackles the numerical solution of high-dimensional nonlinear BSDEs by introducing a backward differential deep learning framework (DLBDP) that uses Malliavin calculus to recast the BSDE as a differential learning problem and jointly learns the triple $(Y, Z, \Gamma)$. The method discretizes the resulting BSDE system with the Euler-Maruyama scheme and uses three DNNs per time step to approximate the unknown processes, minimizing a differential loss that enforces the discretized dynamics of both $Y$ and $Z$. A convergence analysis under standard regularity assumptions shows that the scheme converges at a rate tied to the time-discretization error and the DNN approximation error, with explicit control via the loss weights $\omega_1, \omega_2$. Numerical experiments in up to 50 dimensions show that DLBDP is substantially more accurate for $Z$, and especially for $\Gamma$, than prior deep learning methods while remaining computationally efficient, which suggests strong potential for high-dimensional option pricing and hedging.
Abstract
In this work, we propose a novel backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations (BSDEs), in which the deep neural network (DNN) models are trained not only on the inputs and labels but also on the differentials of the corresponding labels. This is motivated by the fact that differential deep learning can provide an efficient approximation of the labels and their derivatives with respect to the inputs. The BSDEs are reformulated as differential deep learning problems by using Malliavin calculus. The Malliavin derivatives of the solution to a BSDE themselves satisfy another BSDE, which yields a system of BSDEs. This formulation requires the estimation of the solution, its gradient, and the Hessian matrix, represented by the triple of processes $(Y, Z, \Gamma)$. All the integrals within this system are discretized by using the Euler-Maruyama method. Subsequently, DNNs are employed to approximate the triple of these unknown processes. The DNN parameters are optimized backward in time, at each time step, by minimizing a differential learning type loss function, which is defined as a weighted sum of the dynamics of the discretized BSDE system, with the first term providing the dynamics of the process $Y$ and the second those of the process $Z$. An error analysis is carried out to show the convergence of the proposed algorithm. Various numerical experiments in up to $50$ dimensions are provided to demonstrate the high efficiency of the algorithm. Both theoretically and numerically, it is demonstrated that our proposed scheme is more efficient than other contemporary deep learning-based methodologies, especially in the computation of the process $\Gamma$.
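To make the structure of the weighted loss concrete, the following is a minimal NumPy sketch of a one-step differential loss in a deliberately simplified toy setting. It is not the paper's implementation. The assumptions are: the forward process is a standard Brownian motion ($X = W$), the driver is zero, and the terminal function is $g(x) = |x|^2$, so that $Y_t = |W_t|^2 + d(T-t)$, $Z_t = 2W_t$ and $\Gamma_t = 2I$ are known in closed form. In this setting the Euler-Maruyama residuals reduce to $Y_{n+1} - Y_n - Z_n \cdot \Delta W_n$ and $Z_{n+1} - Z_n - \Gamma_n \Delta W_n$. The function name `differential_loss`, the default weights `w1 = w2 = 0.5`, and the use of exact solutions in place of trained DNNs are all illustrative choices.

```python
import numpy as np


def differential_loss(y_next, z_next, y, z, gamma, dW, w1=0.5, w2=0.5):
    """Weighted one-step loss for the toy case X = W with zero driver.

    Shapes: y_next, y -> (batch,); z_next, z, dW -> (batch, d);
    gamma -> (batch, d, d). The weights w1, w2 are hypothetical defaults.
    """
    # Residual of the discretized Y dynamics: Y_{n+1} - Y_n - Z_n . dW_n
    res_y = y_next - y - np.einsum("bi,bi->b", z, dW)
    # Residual of the discretized Z dynamics: Z_{n+1} - Z_n - Gamma_n dW_n
    res_z = z_next - z - np.einsum("bij,bj->bi", gamma, dW)
    loss_y = np.mean(res_y ** 2)
    loss_z = np.mean(np.sum(res_z ** 2, axis=1))
    return w1 * loss_y + w2 * loss_z, loss_y, loss_z


# Toy problem parameters (assumptions chosen for illustration only).
rng = np.random.default_rng(0)
batch, d, T, dt, t_n = 4096, 3, 1.0, 0.1, 0.5

# Sample W at time t_n and one Brownian increment over [t_n, t_n + dt].
W_n = np.sqrt(t_n) * rng.standard_normal((batch, d))
dW = np.sqrt(dt) * rng.standard_normal((batch, d))
W_next = W_n + dW


def exact_y(t, w):
    # Closed-form solution Y_t = |W_t|^2 + d (T - t) for g(x) = |x|^2, f = 0.
    return np.sum(w ** 2, axis=1) + d * (T - t)


def exact_z(w):
    # Gradient of the solution with respect to the state: Z_t = 2 W_t.
    return 2.0 * w


# Exact Hessian Gamma = 2 I, broadcast over the batch.
gamma_exact = np.broadcast_to(2.0 * np.eye(d), (batch, d, d))

total, loss_y, loss_z = differential_loss(
    exact_y(t_n + dt, W_next), exact_z(W_next),
    exact_y(t_n, W_n), exact_z(W_n), gamma_exact, dW,
)

# Same inputs but with a wrong Hessian (all zeros) to show the Z term reacts.
_, _, loss_z_bad = differential_loss(
    exact_y(t_n + dt, W_next), exact_z(W_next),
    exact_y(t_n, W_n), exact_z(W_n), np.zeros((batch, d, d)), dW,
)

print(f"loss_y={loss_y:.4f}  loss_z={loss_z:.2e}  loss_z_bad={loss_z_bad:.4f}")
```

In this toy case the $Z$ term vanishes up to rounding when the exact $\Gamma$ is supplied and grows when $\Gamma$ is wrong, which illustrates why a loss that includes the $Z$ dynamics carries direct information about $\Gamma$. The $Y$ term does not vanish even for the exact solution: its residual $|\Delta W_n|^2 - d\,\Delta t$ has mean zero and variance $2d\,\Delta t^2$, so it reflects the time-discretization error rather than an approximation error.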
