Risk Estimation in a Markov Cost Process: Lower and Upper Bounds

Gugan Thoppe, L. A. Prashanth, Sanjay Bhat

TL;DR

The paper tackles risk-sensitive evaluation in infinite-horizon discounted Markov cost processes, focusing on variance, VaR, and CVaR. It proves minimax lower bounds of $\Omega(1/\epsilon^2)$ samples for risk estimation and develops truncation-based estimators for CVaR and variance whose upper bounds match the lower bounds up to logarithmic factors, with an extension to Lipschitz-continuous risk measures. The results cover both deterministic and stochastic cost settings and provide detailed proofs for VaR, CVaR, and variance estimation, including concentration guarantees. This work is the first to establish nontrivial lower and upper bounds for risk measures beyond the mean in a Markovian setting, offering theoretical guidance for risk-aware policy evaluation and planning in RL.

Abstract

We tackle the problem of estimating risk measures of the infinite-horizon discounted cost within a Markov cost process. The risk measures we study include variance, Value-at-Risk (VaR), and Conditional Value-at-Risk (CVaR). First, we show that estimating any of these risk measures with $\epsilon$-accuracy, either in expected or high-probability sense, requires at least $\Omega(1/\epsilon^2)$ samples. Then, using a truncation scheme, we derive an upper bound for the CVaR and variance estimation. This bound matches our lower bound up to logarithmic factors. Finally, we discuss an extension of our estimation scheme that covers more general risk measures satisfying a certain continuity criterion, e.g., spectral risk measures, utility-based shortfall risk. To the best of our knowledge, our work is the first to provide lower and upper bounds for estimating any risk measure beyond the mean within a Markovian setting. Our lower bounds also extend to the infinite-horizon discounted costs' mean. Even in that case, our lower bound of $\Omega(1/\epsilon^2)$ improves upon the existing $\Omega(1/\epsilon)$ bound [13].
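
To make the truncation idea concrete, here is a minimal Python sketch, not the paper's estimator, whose exact form this summary does not give. It assumes costs bounded in absolute value by a constant `cost_bound`, so that dropping all terms from step $T$ onward changes the discounted cost by at most $\gamma^T \cdot \texttt{cost\_bound}/(1-\gamma)$. It also assumes a known transition matrix `P` used only to simulate trajectories. The function names (`truncation_horizon`, `truncated_discounted_costs`, `empirical_risks`) are hypothetical, and the plug-in variance, VaR, and CVaR formulas are the standard empirical ones.

```python
import math
import numpy as np

def truncation_horizon(gamma, cost_bound, eps):
    """Smallest T >= 1 with gamma**T * cost_bound / (1 - gamma) <= eps.
    Assumption: |f(s)| <= cost_bound for every state s."""
    if gamma == 0.0:
        return 1
    ratio = cost_bound / ((1.0 - gamma) * eps)
    if ratio <= 1.0:
        return 1
    return max(1, math.ceil(math.log(ratio) / math.log(1.0 / gamma)))

def truncated_discounted_costs(P, f, s0, gamma, horizon, n_traj, rng):
    """Simulate n_traj trajectories from state s0 and return, for each one,
    the truncated discounted cost sum_{t < horizon} gamma**t * f[s_t]."""
    n_states = P.shape[0]
    cdf = np.cumsum(P, axis=1)
    states = np.full(n_traj, s0)
    totals = np.zeros(n_traj)
    discount = 1.0
    for _ in range(horizon):
        totals += discount * f[states]
        discount *= gamma
        # Inverse-CDF sampling of the next state for all trajectories at once.
        u = rng.random(n_traj)
        states = (u[:, None] > cdf[states]).sum(axis=1)
        states = np.minimum(states, n_states - 1)  # guard against rounding in cdf
    return totals

def empirical_risks(samples, alpha):
    """Plug-in estimates of variance, VaR_alpha and CVaR_alpha from samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(int(math.ceil(alpha * len(x))) - 1, 0)  # empirical alpha-quantile index
    var_alpha = x[k]
    cvar_alpha = var_alpha + np.maximum(x - var_alpha, 0.0).mean() / (1.0 - alpha)
    return x.var(ddof=1), var_alpha, cvar_alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = np.array([[0.9, 0.1], [0.2, 0.8]])  # illustrative two-state chain
    f = np.array([0.0, 1.0])                # illustrative bounded costs
    T = truncation_horizon(gamma=0.9, cost_bound=1.0, eps=0.01)
    costs = truncated_discounted_costs(P, f, 0, 0.9, T, 10_000, rng)
    print(T, empirical_risks(costs, alpha=0.95))
```

The horizon grows only logarithmically in $1/\epsilon$, which is consistent with the abstract's statement that the upper bound matches the lower bound up to logarithmic factors.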

Paper Structure

This paper contains 19 sections, 12 theorems, 52 equations, 2 tables.

Key Result

Theorem 3.1

For every MCP $(M, f) \in \mathscr{M},$ let the risk measure $\eta(M, f)$ be either $\mathcal{F}(M, f)$'s VaR $v_\alpha(M, f)$ or CVaR $c_\alpha(M, f)$ at a given $\alpha \in (0, 1).$ Then, for every $n \in \mathbb{N},$ error threshold $\epsilon > 0,$ and discount factor $\gamma \in [0, 1),$ and where $\hat{\eta}(H_n, f) \equiv \hat{\eta}_n(H_n, f).$
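
For reference, the quantities named in the theorem can be written out using the standard definitions. This is a sketch under assumptions: $\mathcal{F}(M, f)$ is taken to be the infinite-horizon discounted cost along the state sequence $(s_t)$ of $M$, and the VaR and CVaR conventions are the usual ones for a cost random variable $X$. The paper's own conventions may differ in details such as strict versus non-strict inequalities.

```latex
\mathcal{F}(M, f) = \sum_{t=0}^{\infty} \gamma^{t} f(s_t), \qquad
v_\alpha(X) = \inf\{\xi \in \mathbb{R} : \Pr(X \le \xi) \ge \alpha\}, \qquad
c_\alpha(X) = v_\alpha(X) + \frac{1}{1-\alpha}\,
  \mathbb{E}\big[(X - v_\alpha(X))^{+}\big].
```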

Theorems & Definitions (42)

  • Definition 2.1: Risk Estimation Algorithm
  • Definition 2.2: Reset Policy
  • Definition 2.3: Estimator
  • Definition 2.4: Resetted Chain
  • Remark 2.5
  • Theorem 3.1: Minimax Lower Bound
  • proof
  • Remark 3.2
  • Remark 3.3
  • Theorem 3.4: Lower Bound
  • ...and 32 more