Solving high-dimensional partial differential equations using deep learning

Jiequn Han; Arnulf Jentzen; Weinan E

TL;DR

Problem: traditional methods for high-dimensional semilinear parabolic PDEs become computationally intractable because of the curse of dimensionality.

Approach: reformulate the PDE as a backward stochastic differential equation (BSDE), approximate the gradient of the solution with deep neural networks, and chain these approximations into a forward-in-time computation (the deep BSDE network) that is trained by minimizing the mismatch with the terminal condition.

Contributions: the method is demonstrated on a 100-dimensional nonlinear Black-Scholes equation with default risk, a 100-dimensional Hamilton-Jacobi-Bellman equation, and a 100-dimensional Allen-Cahn equation, reaching relative errors of 0.46%, 0.17%, and 0.30% in runtimes of 1607, 330, and 647 seconds, respectively, on standard hardware; the architecture also supports evaluating u and its gradient along stochastic paths.

Significance: enables efficient, scalable solution of complex high-dimensional PDEs in economics, finance, operations research, and physics, broadening the scope of multi-agent and multi-asset models.

Abstract

Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up new possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their inter-relationships.
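As a rough illustration of the reformulation described in the abstract, the sketch below steps a discretized BSDE forward in time and measures the mismatch with the terminal condition. It assumes the semilinear form $u_t + \tfrac12\sigma^2\Delta u + f(t,x,u,\sigma\nabla u) = 0$ with $u(T,x) = g(x)$, a constant scalar $\sigma$, and no drift; these choices, the names `terminal_loss` and `z_fn`, and the toy problem are assumptions made for illustration and are not taken from the text above. In the method itself, `z_fn` would be replaced by neural sub-networks and `y0` by a trainable parameter; here the exact gradient of a toy problem stands in so the code runs without a deep learning library.

```python
import math
import random


def terminal_loss(y0, z_fn, f, g, x0, T, N, sigma, n_paths, seed=0):
    """Mean squared mismatch between Y_N and g(X_N) over simulated paths."""
    rng = random.Random(seed)
    dt = T / N
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x, y = list(x0), y0
        for n in range(N):
            t = n * dt
            # Brownian increment for this time step, one entry per dimension.
            dw = [rng.gauss(0.0, sqrt_dt) for _ in x]
            # Stand-in for the sub-network approximating sigma * grad u(t_n, X_n).
            z = z_fn(t, x)
            # Discretized BSDE: Y_{n+1} = Y_n - f(...) dt + Z_n . dW_n
            y = y - f(t, x, y, z) * dt + sum(zi * wi for zi, wi in zip(z, dw))
            # Forward SDE without drift: X_{n+1} = X_n + sigma dW_n
            x = [xi + sigma * wi for xi, wi in zip(x, dw)]
        total += (g(x) - y) ** 2
    return total / n_paths


# Toy problem (an assumption): u_t + Laplacian(u) = 0 with g(x) = |x|^2,
# whose exact solution is u(t, x) = |x|^2 + 2 d (T - t).
d, T, N = 10, 1.0, 20
sigma = math.sqrt(2.0)
x0 = [0.0] * d


def g(x):
    return sum(xi * xi for xi in x)


def f(t, x, y, z):
    return 0.0


def z_exact(t, x):
    # sigma * grad u = sigma * 2 x for the toy solution above.
    return [sigma * 2.0 * xi for xi in x]


y0_exact = g(x0) + 2.0 * d * T
loss_exact = terminal_loss(y0_exact, z_exact, f, g, x0, T, N, sigma, 500)
loss_shifted = terminal_loss(y0_exact + 5.0, z_exact, f, g, x0, T, N, sigma, 500)
print(loss_exact, loss_shifted)
```

With the exact gradient supplied, the loss is smallest near the true initial value, which is the property training exploits: minimizing the terminal mismatch over `y0` and the parameters behind `z_fn` drives `y0` toward $u(0, x_0)$. The residual loss at the exact `y0` comes only from the time discretization.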

Paper Structure

This paper contains 1 section, 18 equations, 4 figures, 1 table.

Introduction

Figures (4)

Figure 1: Plot of $\theta_{u_0}$ as an approximation of $u(t{=}0,x{=}(100,\dots,100))$ against the number of iteration steps in the case of the $100$-dimensional nonlinear Black-Scholes equation \ref{eq:PDE_defaultrisk} with $40$ equidistant time steps ($N{=}40$) and learning rate $0.008$. The shaded area depicts the mean $\pm$ the standard deviation of $\theta_{u_0}$ as an approximation of $u(t{=}0,x{=}(100,\dots,100))$ for $5$ independent runs. The deep BSDE method achieves a relative error of size $0.46\%$ in a runtime of $1607$ seconds.
Figure 2: Top: Relative error of the deep BSDE method for $u( t{=}0, x{=}(0,\dots,0) )$ when $\lambda = 1$ against the number of iteration steps in the case of the $100$-dimensional Hamilton-Jacobi-Bellman equation \ref{eq:PDE_HJB} with $20$ equidistant time steps ($N{=}20$) and learning rate $0.01$. The shaded area depicts the mean $\pm$ the standard deviation of the relative error for $5$ different runs. The deep BSDE method achieves a relative error of size $0.17\%$ in a runtime of $330$ seconds. Bottom: Optimal cost $u(t{=}0,x{=}(0,\dots,0))$ against different values of $\lambda$ in the case of the $100$-dimensional Hamilton-Jacobi-Bellman equation \ref{eq:PDE_HJB}, obtained by the deep BSDE method and classical Monte Carlo simulations of \ref{eq:HJB_formula}.
Figure 3: Top: Relative error of the deep BSDE method for $u(t{=}0.3,x{=}(0,\dots,0))$ against the number of iteration steps in the case of the $100$-dimensional Allen-Cahn equation \ref{eq:PDE_allencahn} with $20$ equidistant time steps ($N{=}20$) and learning rate $0.0005$. The shaded area depicts the mean $\pm$ the standard deviation of the relative error for $5$ different runs. The deep BSDE method achieves a relative error of size $0.30\%$ in a runtime of $647$ seconds. Bottom: Time evolution of $u(t,x{=}(0,\dots,0))$ for $t\in[0,0.3]$ in the case of the $100$-dimensional Allen-Cahn equation \ref{eq:PDE_allencahn} computed by means of the deep BSDE method.
Figure 4: Illustration of the network architecture for solving semilinear parabolic PDEs with $H$ hidden layers for each sub-network and $N$ time intervals. The whole network has $(H+1)(N-1)$ layers in total that involve free parameters to be optimized simultaneously. Each column for $t = t_1, t_2,\dots,t_{N-1}$ corresponds to a sub-network at time $t$. $h^1_n, \dots, h^H_n$ are the intermediate neurons in the sub-network at time $t=t_n$ for $n= 1, 2, \dots, N - 1$.
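
To make the layer count in the Figure 4 caption concrete, the short sketch below evaluates $(H+1)(N-1)$ for the two time discretizations used in Figures 1 to 3 ($N=20$ and $N=40$). The value $H=2$ and the function name `trainable_layers` are assumptions chosen for illustration; the captions do not state the number of hidden layers.

```python
def trainable_layers(num_hidden, num_intervals):
    """Layers with free parameters: (H + 1) in each of the N - 1 sub-networks."""
    if num_hidden < 0 or num_intervals < 1:
        raise ValueError("need num_hidden >= 0 and num_intervals >= 1")
    return (num_hidden + 1) * (num_intervals - 1)


H = 2  # assumed number of hidden layers per sub-network (not given in the captions)
layer_counts = {n: trainable_layers(H, n) for n in (20, 40)}
print(layer_counts)  # {20: 57, 40: 117}
```

The count grows linearly in $N$ because one sub-network is attached to each interior time point $t_1, \dots, t_{N-1}$, and all of them are optimized simultaneously.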
