Risk Estimation in a Markov Cost Process: Lower and Upper Bounds
Gugan Thoppe, L. A. Prashanth, Sanjay Bhat
TL;DR
The paper tackles risk-sensitive evaluation in infinite-horizon discounted Markov cost processes, focusing on variance, VaR, and CVaR. It proves minimax lower bounds of $\Omega(1/\epsilon^2)$ on the number of samples needed for $\epsilon$-accurate risk estimation and develops truncation-based estimators for CVaR and variance whose upper bounds match these lower bounds up to logarithmic factors; the scheme also extends to Lipschitz-continuous risk measures. The results cover both deterministic and stochastic cost settings and provide detailed proofs for VaR, CVaR, and variance estimation, including concentration guarantees. To the authors' knowledge, this is the first work to establish lower and upper bounds for risk measures beyond the mean in a Markovian setting, offering theoretical guidance for risk-aware policy evaluation and planning in RL.
Abstract
We tackle the problem of estimating risk measures of the infinite-horizon discounted cost within a Markov cost process. The risk measures we study include variance, Value-at-Risk (VaR), and Conditional Value-at-Risk (CVaR). First, we show that estimating any of these risk measures with $\epsilon$-accuracy, either in expectation or with high probability, requires at least $\Omega(1/\epsilon^2)$ samples. Then, using a truncation scheme, we derive an upper bound for CVaR and variance estimation. This bound matches our lower bound up to logarithmic factors. Finally, we discuss an extension of our estimation scheme that covers more general risk measures satisfying a certain continuity criterion, e.g., spectral risk measures and utility-based shortfall risk. To the best of our knowledge, our work is the first to provide lower and upper bounds for estimating any risk measure beyond the mean within a Markovian setting. Our lower bounds also extend to the mean of the infinite-horizon discounted cost. Even in that case, our lower bound of $\Omega(1/\epsilon^2)$ improves upon the existing $\Omega(1/\epsilon)$ bound [13].
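The abstract does not spell out the truncation scheme, so the following is only a minimal sketch of one way such an estimator could look, not the authors' construction. It assumes a finite-state Markov cost process with a known simulator, deterministic per-state costs bounded in absolute value by `cost_bound`, and independent trajectories from a fixed start state. Under that assumption, cutting the discounted sum after $T$ steps changes the cumulative cost by at most `cost_bound * gamma**T / (1 - gamma)`, so $T$ can be chosen to keep this below $\epsilon$. All function and variable names are hypothetical.

```python
import math
import random
import statistics


def truncation_horizon(gamma, cost_bound, eps):
    """Smallest T with cost_bound * gamma**T / (1 - gamma) <= eps (assumed bounded costs)."""
    ratio = cost_bound / (eps * (1.0 - gamma))
    return max(math.ceil(math.log(ratio) / math.log(1.0 / gamma)), 0)


def sample_truncated_cost(P, cost, start, gamma, T, rng):
    """Simulate T steps of the chain and return the truncated discounted cost."""
    state, total, discount = start, 0.0, 1.0
    for _ in range(T):
        total += discount * cost[state]
        discount *= gamma
        # Draw the next state from row P[state] of the transition matrix.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return total


def empirical_var_cvar(samples, alpha):
    """Empirical VaR (alpha-quantile) and CVaR of a list of cost samples."""
    xs = sorted(samples)
    n = len(xs)
    k = min(max(math.ceil(alpha * n) - 1, 0), n - 1)
    var = xs[k]
    # Rockafellar-Uryasev form: VaR + E[(X - VaR)^+] / (1 - alpha).
    cvar = var + sum(max(x - var, 0.0) for x in xs) / (n * (1.0 - alpha))
    return var, cvar


# Illustrative two-state chain (made-up numbers, not from the paper).
P = [[0.9, 0.1], [0.5, 0.5]]
cost = [0.0, 1.0]
gamma, eps, alpha = 0.9, 0.1, 0.95

rng = random.Random(0)
T = truncation_horizon(gamma, cost_bound=1.0, eps=eps)
samples = [sample_truncated_cost(P, cost, 0, gamma, T, rng) for _ in range(2000)]

variance_hat = statistics.variance(samples)
var_hat, cvar_hat = empirical_var_cvar(samples, alpha)
print(f"T={T}, variance={variance_hat:.3f}, VaR={var_hat:.3f}, CVaR={cvar_hat:.3f}")
```

The horizon grows only logarithmically in $1/\epsilon$, so each truncated trajectory is cheap to simulate. How the truncation error and the sampling error combine into a guarantee for each risk measure is what the paper's analysis addresses; this sketch makes no claim about those rates.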
