Stochastic Optimal Control via Local Occupation Measures

Flemming Holtorf, Alan Edelman, Christopher Rackauckas

TL;DR

This work develops a partitioned, local occupation-measure framework for computing tight, certifiable bounds on stochastic optimal control problems with diffusion and jump dynamics. By localizing occupation measures on a space-time grid and pairing them with dual piecewise-polynomial value-function approximations, the authors construct moment-sum-of-squares (MSOS) semidefinite programming (SDP) relaxations whose size scales linearly with the number of partition cells and that exploit sparsity across adjacent cells. The approach yields numerical advantages: tighter bounds at lower polynomial degrees, improved numerical conditioning, and potential for distributed solving. It is demonstrated on a population-control diffusion example and extended to discounted-horizon problems, stopping problems, and discrete-state jump processes, including a gene-regulation application. Collectively, the framework bridges discretization-based HJB methods and global occupation-measure relaxations, enabling scalable, provable bounds for a broad class of stochastic control problems in engineering and biology.
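The linear-scaling claim can be illustrated with a rough count of unknowns. The sketch below assumes that each grid cell carries one polynomial of degree at most `degree` in the joint space-time variables, and it counts only those coefficients; multipliers, interface conditions between cells, and the PSD blocks of the actual relaxation are ignored. The function names are hypothetical and not taken from the paper.

```python
from math import comb


def monomial_count(num_vars: int, degree: int) -> int:
    """Number of monomials of total degree <= degree in num_vars variables."""
    return comb(num_vars + degree, degree)


def piecewise_coefficient_count(cells_per_axis: tuple, degree: int) -> int:
    """Coefficients of a piecewise polynomial with one polynomial per grid cell.

    cells_per_axis holds the number of cells along each axis (assumption:
    one entry per state dimension plus one for time).
    """
    num_vars = len(cells_per_axis)
    num_cells = 1
    for n in cells_per_axis:
        num_cells *= n
    # Every cell contributes the same number of coefficients, so the
    # total grows linearly with the number of cells at fixed degree.
    return num_cells * monomial_count(num_vars, degree)
```

For example, `piecewise_coefficient_count((2, 2, 5), 4)` covers 20 cells with 35 coefficients each, and doubling the cells along one axis doubles the total, whereas raising `degree` grows the per-cell count combinatorially.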

Abstract

Viewing stochastic processes through the lens of occupation measures has proved to be a powerful angle of attack for the theoretical and computational analysis of stochastic optimal control problems. We present a simple modification of the traditional occupation measure framework derived from resolving the occupation measures locally on a partition of the control problem's space-time domain. This notion of local occupation measures provides fine-grained control over the construction of structured semidefinite programming relaxations for a rich class of stochastic optimal control problems with embedded diffusion and jump processes via the moment-sum-of-squares hierarchy. As such, it bridges the gap between discretization-based approximations to the Hamilton-Jacobi-Bellman equations and occupation measure relaxations. We demonstrate with examples that this approach enables the computation of high-quality bounds for the optimal value of a large class of stochastic optimal control problems with significant performance gains relative to the traditional occupation measure framework.

Paper Structure (19 sections, 3 theorems, 56 equations, 3 figures, 1 table)

This paper contains 19 sections, 3 theorems, 56 equations, 3 figures, 1 table.

Key Result

Corollary 1

Let $w$ be feasible for (eq:subHJB) and let $\delta_z$ denote the Dirac measure centered at $z$. Then $w$ underestimates the value function at every $(t,z)\in[0,T] \times X$.
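A sketch of the standard argument behind a statement of this kind may help. The constraints of (eq:subHJB) are not reproduced in this summary, so the form used below is an assumption: $\mathcal{A}$ denotes the extended generator of the controlled process (including the time derivative), $\ell$ the stage cost, $\phi$ the terminal cost, $U$ the control set, and $V$ the value function. All of these symbols are introduced here for illustration only.

```latex
% Assumed form of the constraints (not reproduced in this summary):
%   \mathcal{A} w(t,x,u) + \ell(x,u) \ge 0  on [0,T] \times X \times U,
%   w(T,x) \le \phi(x)                      on X.
\begin{align*}
\mathbb{E}\big[w(T,x_T)\big] - w(t,z)
  &= \mathbb{E}\left[\int_t^T \mathcal{A}w(s,x_s,u_s)\,ds\right]
  && \text{(Dynkin's formula, } x_t \sim \delta_z\text{)} \\
  &\ge -\,\mathbb{E}\left[\int_t^T \ell(x_s,u_s)\,ds\right], \\
\Longrightarrow\quad w(t,z)
  &\le \mathbb{E}\left[\int_t^T \ell(x_s,u_s)\,ds + w(T,x_T)\right] \\
  &\le \mathbb{E}\left[\int_t^T \ell(x_s,u_s)\,ds + \phi(x_T)\right].
\end{align*}
```

Under these assumptions the bound holds for every admissible control, so taking the infimum over controls gives $w(t,z) \le V(t,z)$.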

Figures (3)

  • Figure 1: Linear scaling with respect to the number of grid cells for fixed approximation order.
  • Figure 2: Trade-off between computational cost and bound quality for different approximation orders $d$ and domain discretizations $(n_1,n_2,n_T)$. The red markers correspond to MSOS restrictions of the labeled approximation order for the traditional formulation (eq:subHJB).
  • Figure 3: Trade-off between computational cost and bound quality for different approximation orders $d$ and domain partitions (different markers).

Theorems & Definitions (8)

  • Corollary 1
  • proof
  • Theorem 1
  • proof
  • Corollary 2
  • proof (sketch)
  • Remark 1
  • proof