Risk-averse formulations of Stochastic Optimal Control and Markov Decision Processes
Alexander Shapiro, Yan Li
TL;DR
This paper develops a unified framework for risk-averse and distributionally robust optimization in sequential decision problems, focusing on Stochastic Optimal Control (SOC) and Markov Decision Processes (MDP). It introduces conditional and nested risk functionals, with particular attention to Value-at-Risk ($V@R$), and derives necessary and sufficient conditions for the existence of non-randomized optimal policies, along with finite- and infinite-horizon dynamic programs. It provides $V@R$-based sample-complexity results, analyzes rectangularity and nested robust formulations, and discusses implications for policy existence and computational tractability. Overall, the work connects risk-averse and distributionally robust viewpoints in multistage settings and offers practical guidance for decision-making under model uncertainty and data limitations.
Abstract
The aim of this paper is to investigate risk-averse and distributionally robust modeling of Stochastic Optimal Control (SOC) and Markov Decision Processes (MDP). We discuss the construction of conditional nested risk functionals, with particular attention given to the Value-at-Risk measure. Necessary and sufficient conditions for the existence of non-randomized optimal policies in the framework of robust SOC and MDP are derived. We also investigate the sample complexity of optimization problems involving the Value-at-Risk measure.
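For readers unfamiliar with the Value-at-Risk measure, the following is a minimal sketch of how it can be estimated from a sample of losses. It assumes the common convention that Value-at-Risk at a given probability level is the left-side quantile of the loss distribution at that level; the paper's own parameterization may differ. The function name `empirical_var` and its arguments are hypothetical and do not come from the paper.

```python
import math


def empirical_var(losses, level):
    """Empirical Value-at-Risk as a left-side sample quantile.

    Returns the smallest sample value z such that the fraction of
    samples less than or equal to z is at least `level`.
    Assumption: `level` is the quantile probability in (0, 1];
    this convention is illustrative, not taken from the paper.
    """
    if not 0.0 < level <= 1.0:
        raise ValueError("level must lie in (0, 1]")
    ordered = sorted(losses)
    n = len(ordered)
    if n == 0:
        raise ValueError("losses must be non-empty")
    # Number of order statistics needed to reach the requested level;
    # the small tolerance guards against floating-point round-up.
    k = math.ceil(level * n - 1e-12)
    return ordered[max(k, 1) - 1]


sample = [3.0, 1.0, 4.0, 2.0]
print(empirical_var(sample, 0.5))   # smallest z with at least half the sample at or below it
print(empirical_var(sample, 0.75))
```

The sample-complexity question raised in the abstract concerns how quickly optimization problems built on estimates of this kind approach their true counterparts as the sample grows; the sketch only illustrates the estimator, not those results.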
