Particle-based algorithm for stochastic optimal control
Sebastian Reich
TL;DR
This work recasts stochastic optimal control as a pair of forward and reverse McKean–Vlasov SDEs and links the value function to the ratio of forward and reverse densities through a Cole–Hopf type transform. It develops a particle-based algorithm that fuses ensemble Kalman filtering with diffusion-map techniques to approximate the required drift and gradient-log-density terms, yielding a time-dependent affine control $u_t(x)= R G(x)^T (A_t x + c_t)$. The approach is illustrated on nonlinear problems (inverted pendulum and controlled Langevin dynamics), showing that small ensembles (as few as $M=d_x+1$) can achieve robust stabilization when supplemented with diffusion-map refinements. The framework bridges diffusion-based generative modeling and stochastic control, offering a scalable, flexible route for high-dimensional control tasks and a basis for future diffusion-map enhancements and infinite-horizon extensions.
Abstract
The solution to a stochastic optimal control problem can be determined by computing the value function from a discretization of the associated Hamilton-Jacobi-Bellman equation. Alternatively, the problem can be reformulated in terms of a pair of forward-backward SDEs, which makes Monte Carlo techniques applicable. More recently, the problem has also been viewed from the perspective of forward and reverse-time SDEs and their associated Fokker-Planck equations. This approach is closely related to techniques used in diffusion-based generative models. Forward and reverse-time formulations express the value function as the ratio of two probability density functions, one stemming from a forward McKean-Vlasov SDE and the other from a reverse McKean-Vlasov SDE. In this paper, we extend this approach to a more general class of stochastic optimal control problems and combine it with ensemble Kalman filter-type and diffusion-map approximation techniques to obtain efficient and robust particle-based algorithms.
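
To illustrate why ensemble Kalman filter-type approximations lead to terms that are affine in the state, the following minimal sketch fits a Gaussian to a small particle ensemble and returns the gradient of its log-density in the form $A x + c$. It is not the algorithm of the paper, only the standard Gaussian identity $\nabla \log \pi(x) = -C^{-1}(x - m)$ evaluated with the empirical mean $m$ and covariance $C$. The function name `gaussian_score_from_ensemble`, the regularization parameter `reg`, and the random test ensemble are assumptions introduced here for illustration.

```python
import numpy as np


def gaussian_score_from_ensemble(ensemble, reg=0.0):
    """Fit a Gaussian to the particles and return (A, c) such that
    grad log pi(x) is approximated by A @ x + c.

    ensemble: array of shape (M, d), one particle per row.
    reg: hypothetical ridge term added to the covariance for stability.
    """
    M, d = ensemble.shape
    mean = ensemble.mean(axis=0)
    anomalies = ensemble - mean
    # Empirical covariance; generically full rank once M >= d + 1.
    cov = anomalies.T @ anomalies / (M - 1) + reg * np.eye(d)
    A = -np.linalg.inv(cov)   # linear part of the Gaussian score
    c = -A @ mean             # offset so that the score vanishes at the mean
    return A, c


# Smallest ensemble size for which the covariance can be invertible: M = d + 1.
rng = np.random.default_rng(0)
d = 2
particles = rng.normal(size=(d + 1, d))
A, c = gaussian_score_from_ensemble(particles, reg=1e-6)
score_at_mean = A @ particles.mean(axis=0) + c
```

Under these assumptions the approximate score is zero at the ensemble mean and `A` is symmetric negative definite, so the score always points back toward the mean. With fewer than $d+1$ particles the empirical covariance is singular and the sketch relies entirely on `reg`.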
