An entropy penalized approach for stochastic optimization with marginal law constraints. Complete version
Thibaut Bourdais, Nadia Oudjane, Francesco Russo
TL;DR
This work tackles stochastic control with constraints in law by formulating the problem on the canonical path space and introducing an entropy-penalized, two-variable relaxation that enables alternating minimization. The authors prove that the penalized formulation approximates the original problem as $\varepsilon\to 0$ and develop an alternating minimization scheme that converges under a Stability Condition, even in the presence of jumps and non-convex costs via a Mixed Variational Inequality framework. They provide results on measurability, flow-existence, and well-posedness for Markovian martingale problems, and illustrate the method with numerical demand-side management experiments, demonstrating practical potential for complex law constraints. By connecting entropy/Donsker-Varadhan representations with stochastic target and martingale transport concepts, the framework unifies several strands of stochastic control under a tractable penalization approach with potential impact in energy systems and model-free hedging contexts.
Abstract
This paper focuses on stochastic optimal control problems with constraints in law, which are rewritten as minimization problems over probability measures on the canonical space. We introduce a penalized version of this type of problem by splitting the optimization variable and adding an entropic penalization term. We prove that this penalized version constitutes a good approximation of the original control problem and we provide an alternating procedure which converges, under a so-called "Stability Condition", to an approximate solution of the original problem. We extend the approach introduced in a previous paper by the same authors to include jump dynamics, non-convex costs and constraints on the marginal laws of the controlled process. The interest of our approach is illustrated by numerical simulations related to demand-side management problems arising in power systems.
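To fix ideas, the variable-splitting idea can be sketched schematically as follows. The notation is an assumption made for illustration and is not taken from the text above: $J(\mathbb{Q},\mathbb{P})$ denotes a generic cost in which the optimization variable has been split into two probability measures $\mathbb{Q}$ and $\mathbb{P}$ on the canonical space, $H(\mathbb{Q}\,|\,\mathbb{P})$ is the relative entropy, and the weight $1/\varepsilon$ is one possible scaling, chosen so that the penalty forces $\mathbb{Q}=\mathbb{P}$ as $\varepsilon\to 0$. How the law constraints are attached to each variable is left unspecified here.

$$
\inf_{(\mathbb{Q},\mathbb{P})} \; J(\mathbb{Q},\mathbb{P}) + \frac{1}{\varepsilon}\, H(\mathbb{Q}\,|\,\mathbb{P}),
\qquad
H(\mathbb{Q}\,|\,\mathbb{P}) =
\begin{cases}
\mathbb{E}^{\mathbb{Q}}\!\left[\log \dfrac{d\mathbb{Q}}{d\mathbb{P}}\right] & \text{if } \mathbb{Q} \ll \mathbb{P},\\[2mm]
+\infty & \text{otherwise.}
\end{cases}
$$

An alternating scheme would then minimize over one variable with the other held fixed. In the simplest unconstrained case, where the cost is linear in $\mathbb{Q}$, say $J(\mathbb{Q},\mathbb{P}) = \mathbb{E}^{\mathbb{Q}}[\varphi]$ for a bounded measurable functional $\varphi$ (a hypothetical choice), the minimization over $\mathbb{Q}$ is explicit by the classical Donsker-Varadhan variational formula:

$$
\min_{\mathbb{Q}} \left\{ \mathbb{E}^{\mathbb{Q}}[\varphi] + \frac{1}{\varepsilon}\, H(\mathbb{Q}\,|\,\mathbb{P}) \right\}
= -\frac{1}{\varepsilon} \log \mathbb{E}^{\mathbb{P}}\!\left[e^{-\varepsilon \varphi}\right],
\qquad
\frac{d\mathbb{Q}^{*}}{d\mathbb{P}} = \frac{e^{-\varepsilon \varphi}}{\mathbb{E}^{\mathbb{P}}\!\left[e^{-\varepsilon \varphi}\right]}.
$$

This closed-form Gibbs step is what makes the entropic penalty attractive in a two-variable relaxation; the step in the other variable depends on the model and is not covered by this sketch.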
