Markov Decision Processes with Recursive Risk Measures
Nicole Bäuerle, Alexander Glauner
TL;DR
This work develops a general framework for risk-sensitive Markov decision processes in which static risk measures are applied recursively, over finite and infinite horizons, with Borel state and action spaces and unbounded costs. It derives Bellman equations, proves the existence of Markovian optimal policies, and shows that the infinite-horizon model is contractive, which ensures a unique fixed point and a stationary optimal policy. The authors connect the recursive objective to distributionally robust MDPs via dual representations and study monotone models under weaker continuity assumptions. The framework broadens the theoretical foundations for risk-aware and robust decision making in economic and financial contexts beyond the standard expected-cost criterion.
Abstract
In this paper, we consider risk-sensitive Markov Decision Processes (MDPs) with Borel state and action spaces and unbounded cost under both finite and infinite planning horizons. Our optimality criterion is based on the recursive application of static risk measures. This approach is motivated by recursive utilities in the economic literature, has been studied before for the entropic risk measure, and is extended here to an axiomatic characterization of suitable risk measures. We derive a Bellman equation and prove the existence of Markovian optimal policies. For an infinite planning horizon, the model is shown to be contractive and the optimal policy to be stationary. Moreover, we establish a connection to distributionally robust MDPs, which provides a global interpretation of the recursively defined objective function. Monotone models are studied in particular.
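To make the recursive criterion concrete, the following is a minimal sketch on a finite toy model, not the Borel-space setting of the paper. It assumes a fixed-point equation of the form J(x) = min_a ρ(c(x, a, X') + βJ(X')), where X' is the random next state, and it takes Conditional Value-at-Risk (CVaR) as one illustrative choice of the static risk measure ρ. The function names (`cvar`, `bellman_q`, `value_iteration`), the array layouts `P[a, s, s']` and `c[s, a, s']`, and the two-state example are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cvar(values, probs, alpha):
    """CVaR at level alpha in [0, 1) of a discrete cost distribution.

    Uses the Rockafellar-Uryasev form min_q { q + E[(X - q)^+] / (1 - alpha) }.
    The objective is convex and piecewise linear in q, so it suffices to
    search over the atoms of X.
    """
    # excess[i, j] = max(values[j] - values[i], 0) for candidate q = values[i]
    excess = np.maximum(values[None, :] - values[:, None], 0.0)
    return np.min(values + excess @ probs / (1.0 - alpha))


def bellman_q(J, P, c, beta, alpha):
    """Q[s, a] = CVaR of (stage cost + beta * J(next state)) under P[a, s]."""
    n_actions, n_states, _ = P.shape
    Q = np.empty((n_states, n_actions))
    for s in range(n_states):
        for a in range(n_actions):
            Q[s, a] = cvar(c[s, a] + beta * J, P[a, s], alpha)
    return Q


def value_iteration(P, c, beta, alpha, tol=1e-10, max_iter=10_000):
    """Iterate the risk-sensitive Bellman operator until the sup-norm change < tol."""
    J = np.zeros(P.shape[1])
    for _ in range(max_iter):
        J_new = bellman_q(J, P, c, beta, alpha).min(axis=1)
        converged = np.max(np.abs(J_new - J)) < tol
        J = J_new
        if converged:
            break
    # Stationary Markovian policy: greedy action with respect to the limit J
    policy = bellman_q(J, P, c, beta, alpha).argmin(axis=1)
    return J, policy


# Hypothetical toy model: action 0 is "safe" (cost 1 for sure),
# action 1 is "risky" (cost 0 or 1.8 with probability 1/2 each).
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # safe: stay in the current state
    [[0.5, 0.5], [0.5, 0.5]],   # risky: next state is a fair coin flip
])
c = np.zeros((2, 2, 2))
c[:, 0, :] = 1.0
c[:, 1, :] = [0.0, 1.8]

J_neutral, pi_neutral = value_iteration(P, c, beta=0.9, alpha=0.0)
J_averse, pi_averse = value_iteration(P, c, beta=0.9, alpha=0.5)
print(J_neutral, pi_neutral)
print(J_averse, pi_averse)
```

With `alpha=0.0` CVaR reduces to the expectation, so the sketch recovers ordinary value iteration and prefers the risky action (expected stage cost 0.9). With `alpha=0.5` the risky action is evaluated by its worse half (stage cost 1.8), and the safe action becomes optimal in both states. Because CVaR is monotone and translation invariant, the operator in this bounded toy model is a β-contraction in the sup-norm, which is why the plain iteration converges here.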
