Method of Successive Approximations for Stochastic Optimal Control: Contractivity and Convergence
Safouane Taoufik, Badr Missaoui
TL;DR
This work studies stochastic optimal control problems that lack tractable closed-form solutions and solves them with the Method of Successive Approximations (MSA), a fixed-point scheme based on the Stochastic Maximum Principle. It analyzes systems whose drift is one-sided Lipschitz with a negative constant $c$ and whose diffusion is Lipschitz (constant $L^{x}_{\sigma}$), introduces the contraction parameter $\mu = -(c+(L^{x}_{\sigma})^2/2) > 0$, and proves stability and boundedness of the state and adjoint processes. The authors show that the MSA operator is Lipschitz with constant $L_{\mu,T}$, so MSA is a contraction and converges when $L_{\mu,T}<1$; the explicit rate bounds show how the horizon $T$ and $\mu$ influence convergence. These results give rigorous convergence guarantees for MSA in stochastic control and guidance for parameter tuning, with potential implications for related optimization and learning settings.
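As a small numerical illustration of the quantities named above, the sketch below evaluates $\mu$ from hypothetical values of $c$ and $L^{x}_{\sigma}$ and checks its sign. It also evaluates the generic a priori error bound for a contraction mapping from the Banach fixed-point theorem, which is not necessarily the rate bound derived in the paper. The function names and all numerical values are assumptions made for illustration; the constant $L_{\mu,T}$ is not computed here because its expression is not given in this summary.

```python
def contraction_parameter(c: float, lip_sigma: float) -> float:
    """Return mu = -(c + lip_sigma**2 / 2), the parameter defined in the TL;DR."""
    return -(c + lip_sigma ** 2 / 2.0)


def banach_error_bound(lipschitz: float, k: int, first_step: float) -> float:
    """Generic a priori bound for a contraction: L**k / (1 - L) * first_step.

    This is the textbook Banach fixed-point estimate, used here only as a
    stand-in for the paper's own rate bound.
    """
    if not 0.0 <= lipschitz < 1.0:
        raise ValueError("the bound needs a contraction constant in [0, 1)")
    return lipschitz ** k / (1.0 - lipschitz) * first_step


# Hypothetical values: drift constant c = -2, diffusion Lipschitz constant 1.
mu = contraction_parameter(c=-2.0, lip_sigma=1.0)
print(mu, mu > 0)

# Hypothetical contraction constant 0.5 (not derived from the paper).
print(banach_error_bound(lipschitz=0.5, k=3, first_step=1.0))
```

With a negative enough $c$, the term $(L^{x}_{\sigma})^2/2$ contributed by the diffusion is dominated and $\mu$ stays positive, which is the regime the analysis assumes.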
Abstract
The Method of Successive Approximations (MSA) is a fixed-point iterative method for solving stochastic optimal control problems. It is an indirect method based on the conditions derived from the Stochastic Maximum Principle (SMP), an extension of the Pontryagin Maximum Principle (PMP) to stochastic control problems. In this study, we investigate the contractivity and convergence of MSA for a specific class of stochastic dynamical systems: those whose drift coefficient is one-sided Lipschitz with a negative constant and whose diffusion coefficient is Lipschitz continuous. Our analysis proceeds in three steps. First, we prove the stability of the state process with respect to the control process. Second, we establish the stability of the adjoint process. Finally, we prove the contractivity, and hence the convergence, of MSA. This study clarifies when MSA is applicable and effective for stochastic optimal control problems.
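For orientation, one MSA iteration can be sketched as follows in the scalar case. The notation here is assumed and may differ from the paper's: a cost $J(u) = \mathbb{E}[\int_0^T f(X_t,u_t)\,dt + g(X_T)]$ to be minimized, a Hamiltonian $H(x,a,y,z) = b(x,a)\,y + \sigma(x,a)\,z + f(x,a)$, and a diffusion that may or may not depend on the control. Sign conventions for the adjoint equation also vary between references.

```latex
\begin{aligned}
\text{State (forward):}\quad
  & dX^k_t = b(X^k_t, u^k_t)\,dt + \sigma(X^k_t, u^k_t)\,dW_t,
  && X^k_0 = x_0, \\
\text{Adjoint (backward):}\quad
  & dY^k_t = -\partial_x H(X^k_t, u^k_t, Y^k_t, Z^k_t)\,dt + Z^k_t\,dW_t,
  && Y^k_T = \partial_x g(X^k_T), \\
\text{Update:}\quad
  & u^{k+1}_t \in \arg\min_{a} H(X^k_t, a, Y^k_t, Z^k_t).
\end{aligned}
```

The map $u^k \mapsto u^{k+1}$ is the fixed-point operator whose contractivity is at issue: the stability of $X$ in $u$ and of $(Y, Z)$ in $(X, u)$ combine into a Lipschitz bound on this map, and the iteration converges when that bound is below one.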
