Time-reversal solution of BSDEs in stochastic optimal control: a linear quadratic study

Yuhang Mei, Amirhossein Taghvaei

TL;DR

The paper studies the numerical solution of backward stochastic differential equations (BSDEs) arising in stochastic optimal control (SOC). It compares least-squares Monte Carlo (LSMC) and time-reversal (TR) approaches within a unified framework, treating both the value-function BSDE and the co-state BSDE and benchmarking against the linear-quadratic (LQ) case, for which explicit formulas are provided to validate the results. The TR method is reported to deliver higher accuracy and stability, particularly for the co-state BSDE and in higher dimensions. The findings suggest that time-reversal techniques, including score-based drift estimation, offer a robust alternative to conditional-expectation regression for solving SOC BSDEs, with potential applicability beyond the LQ setting. The work motivates further exploration of nonlinear extensions.

Abstract

This paper addresses the numerical solution of backward stochastic differential equations (BSDEs) arising in stochastic optimal control. Specifically, we investigate two BSDEs: one derived from the Hamilton-Jacobi-Bellman equation and the other from the stochastic maximum principle. For both formulations, we analyze and compare two numerical methods. The first utilizes the least-squares Monte-Carlo (LSMC) approach for approximating conditional expectations, while the second leverages a time-reversal (TR) of diffusion processes. Although both methods extend to nonlinear settings, our focus is on the linear-quadratic case, where analytical solutions provide a benchmark. Numerical results demonstrate the superior accuracy and efficiency of the TR approach across both BSDE representations, highlighting its potential for broader applications in stochastic control.
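
As an illustration of the conditional-expectation step that LSMC relies on, the following is a minimal sketch, not the paper's algorithm. It assumes a scalar state, a polynomial basis, and synthetic data; the function name `lsmc_conditional_expectation` and all parameters are hypothetical. It regresses samples of a target `y` on basis functions of `x` to approximate $E[y \mid x]$.

```python
import numpy as np


def lsmc_conditional_expectation(x, y, degree=2):
    """Approximate E[y | x] by least squares on a polynomial basis.

    Hypothetical helper for illustration: x and y are 1-D sample arrays.
    Returns the fitted coefficients and a predictor function.
    """
    # Basis matrix with columns 1, x, x^2, ..., x^degree.
    basis = np.vander(x, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(basis, y, rcond=None)

    def predict(x_new):
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        return np.vander(x_new, degree + 1, increasing=True) @ coeffs

    return coeffs, predict


# Synthetic check (assumed data): y = x^2 + noise, so E[y | x] = x^2.
rng = np.random.default_rng(0)
n_samples = 20000
x = rng.standard_normal(n_samples)
y = x**2 + rng.standard_normal(n_samples)

coeffs, predict = lsmc_conditional_expectation(x, y)
```

Here the fitted coefficients should be close to those of $x^2$, since the true conditional expectation lies in the span of the chosen basis. In a BSDE solver this regression would be repeated at every time step, backward in time, which is where the choice of basis and the sample size start to matter.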

Paper Structure

This paper contains 13 sections, 41 equations, 4 figures, 1 table, 2 algorithms.

Figures (4)

  • Figure 1: Numerical result for Section \ref{sec:numerics-accuracy}: the entries of the matrix $G_t$ obtained from the four algorithms, (a) LSMC-V, (b) LSMC-C, (c) TR-V, and (d) TR-C, compared with their exact values. Solid lines denote the exact solution and dotted lines denote the numerical approximation. The time horizon is $T=4$, the time step-size is $\Delta t = 0.02$, and the sample size is $N=2000$.
  • Figure 2: Numerical result for Section \ref{sec:numerics-accuracy}: the value of the SOC cost \ref{eq:SOC} over the first 15 iterations of the four methods.
  • Figure 3: Numerical results for Section \ref{sec:dt-sample}. (a) MSE of the four methods for different time step-sizes $\Delta t$, with time horizon $T=4$ and sample size $N=1000$. (b) MSE of the four methods for different numbers of samples $N$, with $\Delta t=0.02$. The shaded region represents the range from the minimum to the maximum across 15 experiments.
  • Figure 4: Numerical results for Section \ref{sec:numerics-dim}: MSE of the four methods as the problem dimension $n$ varies. The MSE is normalized by the factor $n^2$. The shaded region represents the range from the minimum to the maximum across 15 experiments.

Theorems & Definitions (1)

  • Remark 1