Stochastic Optimal Control via Local Occupation Measures
Flemming Holtorf, Alan Edelman, Christopher Rackauckas
TL;DR
This work develops a partitioned, local occupation measure framework for computing tight, certifiable bounds on the optimal value of stochastic optimal control problems with diffusion and jump dynamics. By localizing occupation measures on a space-time grid and pairing them with dual piecewise-polynomial value-function approximations, the authors construct moment-sum-of-squares (MSOS) semidefinite programming (SDP) relaxations whose size scales linearly with partition granularity and that exploit sparsity across adjacent cells. The approach yields substantial numerical advantages: tighter bounds at lower polynomial degrees, improved numerical conditioning, and potential for distributed solving. These advantages are demonstrated on a population-control diffusion example, and the framework is extended to discounted horizons, stopping problems, and discrete-state jump processes, including a gene-regulation application. Collectively, the framework bridges discretization-based Hamilton-Jacobi-Bellman (HJB) methods and global occupation measure relaxations, enabling scalable, provable bounds for a broad class of stochastic control problems in engineering and biology.
Abstract
Viewing stochastic processes through the lens of occupation measures has proved to be a powerful angle of attack for the theoretical and computational analysis of stochastic optimal control problems. We present a simple modification of the traditional occupation measure framework derived from resolving the occupation measures locally on a partition of the control problem's space-time domain. This notion of local occupation measures provides fine-grained control over the construction of structured semidefinite programming relaxations for a rich class of stochastic optimal control problems with embedded diffusion and jump processes via the moment-sum-of-squares hierarchy. As such, it bridges the gap between discretization-based approximations to the Hamilton-Jacobi-Bellman equations and occupation measure relaxations. We demonstrate with examples that this approach enables the computation of high-quality bounds for the optimal value of a large class of stochastic optimal control problems with significant performance gains relative to the traditional occupation measure framework.
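To make the idea of resolving an occupation measure locally more concrete, the following is a minimal illustrative sketch and not the authors' method: it does not build or solve any semidefinite relaxation. It estimates by Monte Carlo the expected time a diffusion spends in each cell of a space-time grid. The test dynamics dX = -X dt + 0.3 dW, the Euler-Maruyama discretization, the grid sizes, and the function name `local_occupation_masses` are all assumptions chosen for illustration only.

```python
import numpy as np

def local_occupation_masses(n_paths=2000, n_steps=200, T=1.0, x0=1.0,
                            n_t_cells=4, n_x_cells=5, seed=0):
    """Estimate the expected time spent in each space-time cell.

    Hypothetical test problem (an assumption, not from the paper):
    uncontrolled dynamics dX = -X dt + 0.3 dW, simulated by Euler-Maruyama.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0)
    times = np.empty((n_steps, n_paths))
    states = np.empty((n_steps, n_paths))
    for k in range(n_steps):
        # Record the left endpoint of each time step, then advance the paths.
        times[k] = k * dt
        states[k] = x
        noise = rng.standard_normal(n_paths)
        x = x + (-x) * dt + 0.3 * np.sqrt(dt) * noise

    # Partition the space-time domain into a rectangular grid of cells.
    t_edges = np.linspace(0.0, T, n_t_cells + 1)
    x_edges = np.linspace(states.min(), states.max(), n_x_cells + 1)

    # Count visits per cell; each visit contributes dt units of time,
    # averaged over the simulated paths.
    counts, _, _ = np.histogram2d(times.ravel(), states.ravel(),
                                  bins=[t_edges, x_edges])
    return counts * dt / n_paths, t_edges, x_edges

masses, t_edges, x_edges = local_occupation_masses()
# The local masses are nonnegative and add up to the horizon T, i.e. the
# total mass of the global occupation measure on [0, T].
print(masses.shape, masses.sum())
```

Each entry of `masses` approximates the mass that the occupation measure assigns to one cell. In the framework described above, the relaxation works with moments of such cell-restricted measures rather than sampled estimates; the sketch only shows how a single global measure decomposes over a partition.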
