An entropy penalized approach for stochastic optimization with marginal law constraints. Complete version

Thibaut Bourdais, Nadia Oudjane, Francesco Russo

TL;DR

This work tackles stochastic control with constraints in law by formulating the problem on the canonical path space and introducing an entropy-penalized, two-variable relaxation that enables alternating minimization. The authors prove that the penalized formulation approximates the original problem as $\varepsilon\to 0$ and develop an alternating minimization scheme that converges under a Stability Condition, even in the presence of jumps and non-convex costs via a Mixed Variational Inequality framework. They provide results on measurability, flow-existence, and well-posedness for Markovian martingale problems, and illustrate the method with numerical demand-side management experiments, demonstrating practical potential for complex law constraints. By connecting entropy/Donsker-Varadhan representations with stochastic target and martingale transport concepts, the framework unifies several strands of stochastic control under a tractable penalization approach with potential impact in energy systems and model-free hedging contexts.

Abstract

This paper focuses on stochastic optimal control problems with constraints in law, which are rewritten as minimization problems over probability measures on the canonical space. We introduce a penalized version of this type of problem by splitting the optimization variable and adding an entropic penalization term. We prove that this penalized version constitutes a good approximation of the original control problem and we provide an alternating procedure which converges, under a so-called "Stability Condition", to an approximate solution of the original problem. We extend the approach introduced in a previous paper by the same authors to include jump dynamics, non-convex costs and constraints on the marginal laws of the controlled process. The interest of our approach is illustrated by numerical simulations related to demand-side management problems arising in power systems.
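The splitting-plus-entropy idea can be illustrated on a toy finite problem. The sketch below is not the paper's algorithm: it assumes a finite grid in place of the canonical path space, a hypothetical cost table `phi`, product measures as the admissible set, and a penalty weight of 1/eps on the relative entropy. Under those assumptions it alternates between a Gibbs reweighting step in the free variable `q` and an entropic projection step in the admissible variable `p`.

```python
import numpy as np

def kl(q, p):
    """Relative entropy H(q | p) on a finite set, with the convention 0 log 0 = 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def objective(p, q, phi, eps):
    """Toy penalized objective: expected cost under q plus (1/eps) * H(q | p)."""
    return float(np.sum(q * phi)) + kl(q, p) / eps

def alternating_minimization(phi, eps, n_iter=50):
    """Alternate exact minimization in q (any law) and p (product laws only).

    All names and the choice of admissible set are illustrative assumptions.
    """
    n, m = phi.shape
    p = np.full((n, m), 1.0 / (n * m))  # start from the uniform product measure
    values = []
    for _ in range(n_iter):
        # q-step: for fixed p, the minimizer is the Gibbs measure q ~ p * exp(-eps * phi).
        # Subtracting phi.min() only rescales the weights and avoids overflow.
        w = p * np.exp(-eps * (phi - phi.min()))
        q = w / w.sum()
        values.append(objective(p, q, phi, eps))
        # p-step: for fixed q, H(q | p) over product measures is minimized
        # by the product of the marginals of q.
        p = np.outer(q.sum(axis=1), q.sum(axis=0))
    return p, q, values
```

Because each step is an exact minimization in one variable, the recorded objective values are non-increasing in this toy setting. Convergence of the actual scheme relies on the Stability Condition stated in the paper, which this sketch does not model.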

Paper Structure

This paper contains 19 sections, 32 theorems, 125 equations, 2 figures.

Key Result

Proposition 2.6

Let ${\mathbb P} \in {\cal P}(\Omega)$. Let $\mu$ be a random measure with ${\mathbb P}$-compensator $\mu^L := dtL(t, X, dq)$, where $L$ is a Lévy kernel in the sense of the paper's definition of a Lévy kernel. Let $W$ be a $\tilde{{\cal P}}$-measurable function and let $b_0 > 0$. The following statements are equ

Figures (2)

  • Figure 1: Costs associated with the iterates generated by the entropy penalized Monte-Carlo algorithm in dimension $d = 5$ and $K = 100$.
  • Figure 2: Comparison of the estimated marginal densities of $X_T$ with their respective target densities in dimension $d = 5$ with $K = 100$.

Theorems & Definitions (92)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Remark 2.4
  • Definition 2.5
  • Proposition 2.6
  • Definition 2.7
  • Remark 2.8
  • Definition 2.9
  • Definition 2.10
  • ...and 82 more