Robust Stochastic Optimal Control via variance penalization: Application to Energy Management Systems
Paul Malisani, Adrien Spagnol, Vivien Smis-Michel
TL;DR
The paper tackles robust stochastic optimal control under convexity assumptions. To counter the optimizer's curse, it introduces a variance-penalized objective and a Douglas–Rachford-based solver, the Variance-Penalized Progressive Hedging Algorithm (VPPHA). It develops a data-driven framework that combines scenario generation and scenario reduction to make scenario-based control tractable, and it proves that the VPPHA converges to the optimal solution under convexity. The approach is instantiated in a rolling-horizon Energy Management System (EMS) for a stationary battery, using real consumption and production data, and is compared against Model Predictive Control (MPC) and the standard Progressive Hedging Algorithm (PHA). The results show that the VPPHA yields better out-of-sample performance and larger bill reductions, especially during volatile pricing periods. The work shows that variance penalization can improve robustness without increasing the computational burden, offering a practical path to robust EMS deployments through scalable, scenario-based optimization.
Abstract
This paper addresses a class of robust stochastic optimal control problems. Its main contribution is a general optimization model with variance penalization, together with an associated solution algorithm that improves out-of-sample robustness without increasing numerical complexity. The proposed variance-penalized model is inspired by a well-established machine learning practice that aims to limit overfitting, and it extends this idea to stochastic optimal control. Using the Douglas–Rachford splitting method, the authors develop a Variance-Penalized Progressive Hedging Algorithm (VPPHA) that retains the computational complexity of the standard Progressive Hedging Algorithm (PHA) while achieving superior out-of-sample performance. In addition, the authors propose a three-step control framework comprising (i) a random scenario generation method, (ii) a scenario reduction algorithm, and (iii) a scenario-based optimal control computation using the VPPHA. Finally, the proposed method is validated through simulations of a stationary battery Energy Management System (EMS) using ground-truth electricity consumption and production measurements from a predominantly commercial building in Solaize, France. The results demonstrate that the proposed approach outperforms a classical Model Predictive Control (MPC) strategy, which itself performs better than the standard PHA.
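The idea of variance penalization can be illustrated with a minimal sketch. It assumes the penalty is applied to the empirical variance of per-scenario costs, weighted by a hypothetical parameter `lam`; the abstract does not give the paper's exact formulation, and this is not the VPPHA itself. The function names and all numbers below are illustrative assumptions, not data from the paper.

```python
from statistics import fmean, pvariance


def variance_penalized_cost(scenario_costs, lam):
    """Empirical mean of per-scenario costs plus lam times their empirical variance."""
    return fmean(scenario_costs) + lam * pvariance(scenario_costs)


def select_control(candidates, lam):
    """Return the name of the candidate with the lowest penalized cost."""
    return min(candidates, key=lambda name: variance_penalized_cost(candidates[name], lam))


# Hypothetical per-scenario costs (illustrative numbers, not data from the paper).
candidates = {
    "steady": [8.0, 10.0, 12.0],  # higher mean cost, low spread across scenarios
    "risky": [3.0, 9.5, 16.0],    # lower mean cost, high spread across scenarios
}

print(select_control(candidates, lam=0.0))  # plain sample-average criterion
print(select_control(candidates, lam=0.1))  # variance-penalized criterion
```

With `lam=0.0` the sample average alone favors the candidate labeled `risky`; with `lam=0.1` the penalty on the spread of costs shifts the choice to `steady`. This mirrors, in a toy setting, how a variance term can discourage decisions that look good on the sampled scenarios only on average.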
