Multistep schemes for solving backward stochastic differential equations on GPU

Lorenc Kapllani, Long Teng

TL;DR

This work tackles the computational challenge of solving backward stochastic differential equations (BSDEs) by parallelizing the high-order multistep scheme on GPUs. The authors develop a CUDA-based framework that uses a uniform time-space grid, cubic/bicubic interpolation, and Gauss-Hermite quadrature to approximate conditional expectations, with BiCGSTAB solvers for the resulting linear systems. They demonstrate substantial speedups (up to ~70×) and high accuracy across multiple BSDE test cases, including linear and nonlinear drivers, the Black-Scholes FBSDE, and a 2D spread option, validating the practicality of high-order BSDE solvers in financial applications. The approach hinges on efficient non-grid point localization, memory-friendly kernel design, and iterative CUDA optimizations, enabling practical deployment of sophisticated BSDE methods in real-time pricing contexts.

Abstract

The goal of this work is to parallelize the multistep scheme for the numerical approximation of backward stochastic differential equations (BSDEs) in order to achieve both high accuracy and reduced computation time. In the multistep scheme, the computations at each grid point are independent, which motivates us to use massively parallel GPU computing with CUDA. In our investigations, we identify performance bottlenecks and apply appropriate optimization techniques to reduce the computation time, using a uniform domain. Finally, some examples with financial applications are provided to demonstrate the acceleration achieved on GPUs.
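The independence of the per-grid-point computations can be illustrated with a minimal CPU sketch. It is not the authors' CUDA implementation. It assumes a one-dimensional Brownian increment, a fixed 3-point Gauss-Hermite rule, and a function `g` that can be evaluated directly at the quadrature points (the paper instead uses cubic/bicubic interpolation for non-grid points). The names `cond_expectation` and `expectations_on_grid` are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for the weight exp(-t^2) (physicists' convention).
# Assumption: a small fixed rule is enough for illustration.
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def cond_expectation(g, x, dt):
    """Approximate E[g(x + W)] for W ~ N(0, dt) by Gauss-Hermite quadrature."""
    # Substituting w = sqrt(2*dt) * t turns the Gaussian density into exp(-t^2).
    scale = math.sqrt(2.0 * dt)
    total = sum(w * g(x + scale * t) for t, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def expectations_on_grid(g, x_min, dx, n_points, dt):
    """Evaluate the conditional expectation at every point of a uniform grid.

    Each entry depends only on its own grid point, so on a GPU every
    iteration of this loop could be assigned to a separate thread.
    """
    return [cond_expectation(g, x_min + i * dx, dt) for i in range(n_points)]


if __name__ == "__main__":
    # For g(y) = y^2 the exact value is x^2 + dt.
    for i, value in enumerate(expectations_on_grid(lambda y: y * y, -1.0, 0.5, 5, 0.1)):
        print(-1.0 + 0.5 * i, value)
```

A 3-point rule integrates polynomials up to degree 5 exactly, so the quadratic test function above is reproduced up to rounding error. Smooth non-polynomial integrands generally need more nodes.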

Paper Structure

This paper contains 13 sections, 3 theorems, 53 equations, 5 figures, 12 tables.

Key Result

Lemma 2.1

The local estimates of the local truncation errors in eq16 satisfy where $C > 0$ is a constant depending on $T$, $f$, $g$ and the derivatives of $f$ and $g$.

Figures (5)

  • Figure 1: Plots of naive results for Example 5.1.
  • Figure 2: Plots of naive results for Example 5.2.
  • Figure 3: Plots of naive results for Example 5.3.
  • Figure 4: Plots of naive results for Example 5.4.
  • Figure 5: Plots of naive results for Example 5.5.

Theorems & Definitions (9)

  • Lemma 2.1
  • Theorem 2.1
  • Theorem 2.2
  • Remark 2.3
  • Example 5.1
  • Example 5.2
  • Example 5.3
  • Example 5.4
  • Example 5.5