Stochastic Optimal Control via Local Occupation Measures

Flemming Holtorf; Alan Edelman; Christopher Rackauckas

TL;DR

This work develops a partitioned, local occupation-measure framework to obtain tight, certifiable bounds for stochastic optimal control problems with diffusion and jump dynamics. By localizing occupation measures on a space-time grid and pairing them with dual piecewise-polynomial value-function approximations, the authors construct moment-sum-of-squares (MSOS) semidefinite programming relaxations whose size scales linearly with the number of grid cells and that exploit sparsity across adjacent cells. The approach yields substantial numerical advantages: tighter bounds at lower polynomial degrees, improved numerical conditioning, and potential for distributed solving. It is demonstrated on a population-control diffusion example and extended to discounted infinite-horizon problems, stopped control problems, and jump processes with discrete state space, including a gene-regulation application. Collectively, the framework bridges discretization-based HJB methods and global occupation-measure relaxations, enabling scalable, provable bounds for a broad class of stochastic control problems in engineering and biology.

Abstract

Viewing stochastic processes through the lens of occupation measures has proved to be a powerful angle of attack for the theoretical and computational analysis of stochastic optimal control problems. We present a simple modification of the traditional occupation measure framework derived from resolving the occupation measures locally on a partition of the control problem's space-time domain. This notion of local occupation measures provides fine-grained control over the construction of structured semidefinite programming relaxations for a rich class of stochastic optimal control problems with embedded diffusion and jump processes via the moment-sum-of-squares hierarchy. As such, it bridges the gap between discretization-based approximations to the Hamilton-Jacobi-Bellman equations and occupation measure relaxations. We demonstrate with examples that this approach enables the computation of high-quality bounds for the optimal value of a large class of stochastic optimal control problems with significant performance gains relative to the traditional occupation measure framework.

Paper Structure (19 sections, 3 theorems, 56 equations, 3 figures, 1 table)


Introduction
Problem description & preliminaries
The dual perspective revisited: piecewise polynomial approximation
The primal perspective revisited: local occupation measures
Moment-sum-of-squares approximations: structure & scaling
Example: population control
Control problem
Partition of problem domain
Evaluation of bound quality
Computational aspects
Results
Extensions
Discounted infinite horizon problems
Stopped control problems
Jump processes with discrete state space
...and 4 more sections

Key Result

Corollary 1

Let $w$ be feasible for problem (eq:subHJB) and let $\delta_z$ denote the Dirac measure centered at $z$. Then $w$ underestimates the value function at every $(t,z) \in [0,T] \times X$.
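
The reasoning behind this corollary can be sketched for the pure diffusion case. The excerpt does not reproduce the referenced problem, so the notation below is an assumption rather than the paper's exact statement: $f$ and $g$ denote drift and diffusion coefficients, $u_s \in U$ an admissible control, $\ell$ the stage cost, $\phi$ the terminal cost, $V$ the value function, and $\mathcal{A}$ the extended generator including the time derivative. Feasibility of a sufficiently smooth $w$ is assumed to mean

$$
\mathcal{A}w(s,x,u) + \ell(x,u) \ge 0 \ \text{ on } [0,T] \times X \times U, \qquad w(T,x) \le \phi(x) \ \text{ on } X,
$$

$$
\mathcal{A}w = \partial_s w + f^\top \nabla_x w + \tfrac{1}{2}\,\mathrm{tr}\!\left(g g^\top \nabla_x^2 w\right).
$$

For a process started at $x_t = z$ (initial distribution $\delta_z$) that remains in $X$, Dynkin's formula gives

$$
\mathbb{E}\left[w(T,x_T)\right] - w(t,z) = \mathbb{E}\int_t^T \mathcal{A}w(s,x_s,u_s)\,ds \ \ge\ -\,\mathbb{E}\int_t^T \ell(x_s,u_s)\,ds,
$$

and therefore

$$
w(t,z) \ \le\ \mathbb{E}\left[\int_t^T \ell(x_s,u_s)\,ds + w(T,x_T)\right] \ \le\ \mathbb{E}\left[\int_t^T \ell(x_s,u_s)\,ds + \phi(x_T)\right].
$$

Since the right-hand side holds for every admissible control, taking the infimum over controls yields $w(t,z) \le V(t,z)$ under these assumptions.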

Figures (3)

Figure 1: Linear scaling with respect to the number of grid cells for fixed approximation order.
Figure 2: Trade-off between computational cost and bound quality for different approximation orders $d$ and domain discretizations $(n_1,n_2,n_T)$. The red markers correspond to MSOS restrictions of the labeled approximation order for the traditional formulation (eq:subHJB).
Figure 3: Trade-off between computational cost and bound quality for different approximation orders $d$ and domain partitions (different markers).
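
The linear scaling reported in Figure 1 can be made plausible with a simple count of polynomial coefficients. The sketch below is not the authors' implementation; it only counts the coefficients of a piecewise-polynomial value-function approximation and uses this as a rough proxy for problem size. The function names and the assumption of three space-time variables $(x_1, x_2, t)$ with a grid $(n_1, n_2, n_T)$ are illustrative.

```python
from math import comb


def monomial_count(n_vars: int, degree: int) -> int:
    """Number of monomials of total degree <= degree in n_vars variables."""
    return comb(n_vars + degree, degree)


def piecewise_coefficient_count(grid: tuple, degree: int) -> int:
    """Coefficients of a piecewise polynomial with one polynomial per grid cell.

    grid holds the number of cells along each space-time axis, e.g.
    (n_1, n_2, n_T); this is a hypothetical proxy for problem size, not
    the actual size of the semidefinite program.
    """
    n_vars = len(grid)
    cells = 1
    for n in grid:
        cells *= n
    return cells * monomial_count(n_vars, degree)


if __name__ == "__main__":
    # Refining the grid at fixed degree grows the count linearly in the cell count.
    for grid in [(1, 1, 1), (2, 2, 2), (4, 4, 4)]:
        print(grid, piecewise_coefficient_count(grid, degree=2))
    # Raising the degree on a single cell grows the count combinatorially.
    for d in [2, 4, 6, 8]:
        print(d, piecewise_coefficient_count((1, 1, 1), degree=d))
```

Doubling the number of cells doubles the count at fixed degree, whereas raising the degree on a single cell increases it through the binomial coefficient, which is consistent with the trade-off the figures explore.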

Theorems & Definitions (8)

Corollary 1
proof
Theorem 1
proof
Corollary 2
proof (sketch)
Remark 1
proof
