Time-Reversal of Stochastic Maximum Principle

Amirhossein Taghvaei

TL;DR

Time-Reversal of Stochastic Maximum Principle addresses the numerical difficulty of solving the SMP FBSDE by introducing a time-reversed diffusion driven by a backward Wiener process with Föllmer’s drift, recasting the problem as a mean-field control problem. The author establishes an equivalence in law between the forward and reversed dynamics and proposes a Monte Carlo iterative scheme that solves both the FBSDE and its time-reversal, using mean/covariance drift estimates and regression-based adjoint updates. In the linear-quadratic setting, the paper derives explicit affine control structures and shows how the adjoint processes can be obtained via Riccati-like equations or regression for $G_1(t)$, with a concrete two-dimensional example. The approach offers a practical alternative to PDE-based methods for SMP, with potential extensions to nonlinear problems through score-function regression and high-dimensional mean-field techniques, enabling scalable stochastic control synthesis.

Abstract

The stochastic maximum principle (SMP) specifies a necessary condition for the solution of a stochastic optimal control problem. The condition involves a coupled system of forward and backward stochastic differential equations (FBSDE) for the state and the adjoint processes. Numerical solution of the FBSDE is challenging because the boundary condition of the adjoint process is specified at the terminal time, while the solution must be adapted to the forward-in-time filtration of a Wiener process. In this paper, a "time-reversal" of the FBSDE system is proposed that involves integration with respect to a backward-in-time Wiener process. The time-reversal is used to propose an iterative Monte Carlo procedure that solves the FBSDE system and its time-reversal simultaneously. The procedure involves approximating Föllmer's drift and solving a regression problem between the state and its adjoint at each time. The procedure is illustrated for the linear quadratic (LQ) optimal control problem with a numerical example.

Paper Structure (14 sections, 5 theorems, 46 equations, 1 figure)

Key Result

Lemma 1

The SDE \ref{eq:reverse-sde} is equivalent to the SDE where $\tilde{W}_t := W_{T-t} - W_T$ and $\tilde{X}_t := \overleftarrow{X}_{T-t}$ is adapted to the forward filtration $\tilde{\mathcal{F}}_t :=\sigma\{\tilde{W}_s;0\leq s\leq t\}$, for $t\in[0,T]$.
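
The change of variables in Lemma 1 can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it samples a standard Wiener path on a uniform grid, forms $\tilde{W}_t = W_{T-t} - W_T$ on the same grid, and estimates the mean and variance of $\tilde{W}_T$, which should be close to $0$ and $T$ if $\tilde{W}$ behaves like a Wiener process. The helper names `brownian_path` and `reverse_path`, the grid size, and the number of paths are all assumptions made for this example.

```python
import random

def brownian_path(n_steps, T, rng):
    """Return [W_0, ..., W_T] on a uniform grid with n_steps increments."""
    dt = T / n_steps
    w = [0.0]
    for _ in range(n_steps):
        # Independent Gaussian increment with variance dt.
        w.append(w[-1] + rng.gauss(0.0, dt ** 0.5))
    return w

def reverse_path(w):
    """Return W~ on the same grid, with W~_{t_k} = W_{T - t_k} - W_T."""
    w_T = w[-1]
    return [w_rev - w_T for w_rev in reversed(w)]

rng = random.Random(0)          # fixed seed, chosen arbitrarily
n_steps, T, n_paths = 50, 1.0, 4000

terminal_values = []
for _ in range(n_paths):
    w = brownian_path(n_steps, T, rng)
    w_tilde = reverse_path(w)
    terminal_values.append(w_tilde[-1])   # W~_T = W_0 - W_T = -W_T

mean_T = sum(terminal_values) / n_paths
var_T = sum((v - mean_T) ** 2 for v in terminal_values) / n_paths
print(f"mean of W~_T: {mean_T:.3f}, variance of W~_T: {var_T:.3f} (target {T})")
```

By construction $\tilde{W}_0 = 0$, and each increment of $\tilde{W}$ is the negative of an increment of $W$ taken in reverse order, which is why the reversed path has the same Gaussian increment statistics as the original one.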

Figures (1)

  • Figure 1: Numerical results for a two-dimensional linear quadratic stochastic optimal control problem presented in Sec. \ref{sec:num-example}: (a) sampled trajectories of the state \ref{eq:alg-forward}, its time-reversal \ref{eq:alg-reverse}, and the adjoint process \ref{eq:alg-adjoint}; (b) numerical approximation of the matrix $G(t)$ according to \ref{eq:regression} in comparison to its exact value according to \ref{eq:Ricatti}; (c) value of the objective function \ref{eq:control-cost} as the number of algorithm iterations increases.
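
Panel (b) compares a regression estimate of $G(t)$ with its Riccati value. The paper's regression equation is not reproduced on this page, so the following is only a minimal sketch of the general idea, assuming that at a fixed time the adjoint samples $Y$ are approximately linear in the state samples $X$, i.e. $Y \approx G X$, and that $G$ is fitted by ordinary least squares. The matrix `G_true`, the noise level, the sample count, and the helper `fit_linear_map` are made up for illustration; in the actual procedure the samples would come from the simulated state and adjoint processes.

```python
import random

def fit_linear_map(xs, ys):
    """Least-squares 2x2 matrix G minimizing sum |y - G x|^2 over paired samples."""
    # Accumulate S_xx = sum x x^T and S_yx = sum y x^T.
    sxx = [[0.0, 0.0], [0.0, 0.0]]
    syx = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        for i in range(2):
            for j in range(2):
                sxx[i][j] += x[i] * x[j]
                syx[i][j] += y[i] * x[j]
    # Closed-form inverse of the 2x2 matrix S_xx.
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    inv = [[sxx[1][1] / det, -sxx[0][1] / det],
           [-sxx[1][0] / det, sxx[0][0] / det]]
    # Normal equations: G = S_yx S_xx^{-1}.
    return [[sum(syx[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

rng = random.Random(1)                      # fixed seed, chosen arbitrarily
G_true = [[1.0, 0.5], [0.5, 2.0]]           # hypothetical matrix, not from the paper

xs, ys = [], []
for _ in range(5000):
    x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    # Synthetic "adjoint" sample: linear in x plus small Gaussian noise.
    y = [sum(G_true[i][j] * x[j] for j in range(2)) + 0.1 * rng.gauss(0.0, 1.0)
         for i in range(2)]
    xs.append(x)
    ys.append(y)

G_hat = fit_linear_map(xs, ys)
print("estimated G:", [[round(g, 3) for g in row] for row in G_hat])
```

Repeating such a fit at each time step of the grid would give a time-indexed estimate that can be plotted against a reference solution, in the spirit of the comparison shown in panel (b).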

Theorems & Definitions (11)

  • Lemma 1
  • proof
  • Theorem 1
  • Remark 1
  • Proposition 1
  • Remark 2
  • Lemma 2
  • proof
  • Proposition 2
  • proof
  • ...and 1 more