Comparative e-backtests for general risk measures

Zhanyi Jiao, Qiuqi Wang, Yimiao Zhao

TL;DR

This work develops a non-parametric sequential framework for comparative backtests of general elicitable risk measures using e-values and e-processes and proposes a modified three-zone approach based on weak dominance, which yields more informative conclusions in comparative backtesting.

Abstract

Backtesting risk measures is a central task in financial regulation. While standard backtests evaluate whether a forecasting model is statistically consistent with observed losses, regulatory practice often requires assessing the performance of an internal model relative to benchmark models. We develop a non-parametric sequential framework for comparative backtests of general elicitable risk measures using e-values and e-processes. The proposed methods provide anytime-valid inference and remain robust under dependence and model misspecification. In particular, we propose a modified three-zone approach based on weak dominance, which yields more informative conclusions in comparative backtesting. As a technical building block, we also construct general standard e-backtests for identifiable risk measures and characterize the associated e-values and e-processes. The resulting procedures apply to a broad class of commonly used risk measures, including the mean, variance, Value-at-Risk, Expected Shortfall, and expectiles. Simulation studies and empirical analyses illustrate the effectiveness of the proposed approach.
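To make the idea of an e-process concrete, here is a minimal sketch of a standard (non-comparative) sequential backtest for Value-at-Risk. It is an illustration under stated assumptions, not the construction from the paper: it assumes the e-value $\mathbb{1}\{L_t > r_t\}/(1-\alpha)$, whose conditional expectation is at most 1 whenever the forecast $r_t$ is at least the true conditional $\mathrm{VaR}_\alpha$, and a fixed bet `lam`. The function names `var_e_value`, `e_process`, and `first_rejection` are hypothetical.

```python
def var_e_value(loss, forecast, alpha):
    """Assumed e-value for VaR at level alpha: 1{loss > forecast} / (1 - alpha).

    If the forecast is at least the true conditional VaR, an exceedance has
    probability at most 1 - alpha, so this quantity has expectation <= 1.
    """
    return (1.0 if loss > forecast else 0.0) / (1.0 - alpha)


def e_process(losses, forecasts, alpha, lam=0.01):
    """Running product of (1 - lam + lam * e_t) for a fixed bet lam in [0, 1].

    The fixed bet is a simplifying assumption; adaptive bets are possible as
    long as each bet depends only on past data.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    wealth = 1.0
    path = []
    for loss, forecast in zip(losses, forecasts):
        e = var_e_value(loss, forecast, alpha)
        wealth *= 1.0 - lam + lam * e  # a convex mix of 1 and e_t stays non-negative
        path.append(wealth)
    return path


def first_rejection(path, threshold):
    """Return the first time index (1-based) at which the path reaches
    the threshold, or None if it never does."""
    for t, wealth in enumerate(path, start=1):
        if wealth >= threshold:
            return t
    return None
```

Because the running product is a non-negative supermartingale under the null hypothesis, Ville's inequality bounds the probability that it ever reaches a threshold $1/\delta$ by $\delta$. This is what allows the path to be monitored continuously and stopped at any time without inflating the error rate.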

Paper Structure

This paper contains 38 sections, 11 theorems, 62 equations, 16 figures, 2 tables.

Key Result

Lemma 1

Let $(\rho,\phi):\mathcal{M}\to\mathbb{R}\times I(\mathbb{R})$ be a Bayes pair with a loss function $S:\mathbb{R}^2\to\mathbb{R}$. If $\phi$ has (strict) identification function $v:\mathbb{R}^2\to\mathbb{R}$, then $(\rho,\phi)$ has (strict) identification function $(x,r,z)\mapsto (v(x,z),h(r,z)(S(x,

Figures (16)

  • Figure 1: Left panel: realized losses and risk measure forecasts against the number of observations for iid losses; right panel: average e-processes (log scale) over 1000 runs against the number of observations, where different colors represent different underestimation scenarios.
  • Figure 2: Heat map matrices for $\mathrm{VaR}_\alpha$ forecasts at levels $\alpha = 0.9$ and $\alpha = 0.99$ for simulated time series data with rejection threshold 2. The betting processes are calculated with $c = 0.5$, based on the score function in \ref{eq:scoreVaR}. The horizontal axis represents the internal model and the vertical axis represents the standard model.
  • Figure 3: Heat map matrices for $\mathrm{ex}_\tau$ forecasts at levels $\tau = 0.96561$ and $\tau = 0.99855$ for simulated time series data with rejection threshold 2. The betting processes are calculated with $c = 0.5$, based on the score function in \ref{eq:scoreexp}. The horizontal axis represents the internal model and the vertical axis represents the standard model.
  • Figure 4: Heat map matrices for $(\mathrm{VaR}_\nu, \mathrm{ES}_{\nu})$ forecasts at levels $\nu = 0.754$ and $\nu = 0.975$ for simulated time series data with rejection threshold 2. The betting processes are calculated with $c = 0.5$, based on the score function in \ref{eq:scoreVaRES}. The horizontal axis represents the internal model and the vertical axis represents the standard model.
  • Figure 5: E-processes (log scale) of comparative backtests for $\mathrm{VaR}$, $\mathrm{ex}$ and $(\mathrm{ES},\mathrm{VaR})$ forecasts against the number of observations for simulated time series data. The dashed lines represent rejection thresholds at $2$, $5$ and $10$. The betting processes are calculated with $c=0.5$, based on the scoring functions in Example \ref{ex:2}. The title of each plot indicates the comparison: internal model vs. standard model.
  • ...and 11 more figures
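
The comparative plots above track an e-process that tests whether an internal model forecasts at least as well as a standard model. The following sketch shows one way such a process could be built for $\mathrm{VaR}_\alpha$. It is an assumption-laden illustration rather than the paper's betting process: it uses the pinball score $S(r,x) = (\mathbb{1}\{x \le r\} - \alpha)(r - x)$, normalizes the score difference by its Lipschitz bound $\max(\alpha, 1-\alpha)\,|r - r^*|$ so that each increment lies in $[-1, 1]$, and bets a fixed fraction `lam`, which is not claimed to correspond to the parameter $c$ in the captions. The names `pinball_score` and `comparative_e_process` are hypothetical.

```python
def pinball_score(forecast, loss, alpha):
    """Pinball (quantile) score, non-negative and minimized in expectation
    by the alpha-quantile of the loss distribution."""
    indicator = 1.0 if loss <= forecast else 0.0
    return (indicator - alpha) * (forecast - loss)


def comparative_e_process(losses, internal, standard, alpha, lam=0.5):
    """Wealth path testing H0: the internal forecast has conditional expected
    score no larger than the standard forecast at every step.

    Each factor is 1 + lam * D_t, where D_t is the score difference
    (internal minus standard) divided by its Lipschitz bound, so D_t is in
    [-1, 1] and the factor is non-negative for lam in [0, 1].
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    lipschitz = max(alpha, 1.0 - alpha)
    wealth = 1.0
    path = []
    for x, r_int, r_std in zip(losses, internal, standard):
        bound = lipschitz * abs(r_int - r_std)
        if bound > 0.0:  # identical forecasts carry no evidence either way
            diff = pinball_score(r_int, x, alpha) - pinball_score(r_std, x, alpha)
            wealth *= 1.0 + lam * diff / bound
        path.append(wealth)
    return path
```

Wealth grows when the internal model scores worse than the standard model and shrinks when it scores better, so a path crossing a threshold such as 2, 5 or 10 is evidence against the internal model. Swapping the roles of the two forecast sequences gives the test in the opposite direction.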

Theorems & Definitions (29)

  • Lemma 1
  • Example 1: Standard test
  • Example 2: Comparative test
  • Lemma 2
  • Theorem 1: Single dimension
  • Proposition 1
  • Theorem 2
  • Definition 1
  • Remark 1
  • Lemma 3
  • ...and 19 more