Optimal nonparametric estimation of the expected shortfall risk

Daniel Bartl, Stephan Eckstein

TL;DR

The paper tackles nonparametric estimation of the expected shortfall $\mathrm{ES}_\alpha(X)$ from finite i.i.d. samples, showing that the standard plug-in estimator struggles with heavy-tailed losses. It introduces a novel estimator $\widehat{S}_N$, built from blockwise plug-ins and truncation via quantile-based bounds, which achieves statistically optimal non-asymptotic guarantees and adversarial robustness. The authors establish minimax lower bounds, quantify the impact of Lipschitz continuity of the quantile function, and provide explicit Pareto-case calculations to illustrate the results; they also extend the framework to non-i.i.d. data under mixing and discuss practical extensions. Numerical experiments demonstrate exponential-tail confidence for $\widehat{S}_N$, improved finite-sample performance over the plug-in, and strong resilience to data contamination, highlighting the estimator's potential for robust risk assessment in finance.

Abstract

We address the problem of estimating the expected shortfall risk of a financial loss using a finite number of i.i.d. data. It is well known that the classical plug-in estimator suffers from poor statistical performance when faced with (heavy-tailed) distributions that are commonly used in financial contexts. Further, it lacks robustness, as the modification of even a single data point can cause a significant distortion. We propose a novel procedure for the estimation of the expected shortfall and prove that it recovers the best possible statistical properties (dictated by the central limit theorem) under minimal assumptions and for all finite numbers of data. Further, this estimator is adversarially robust: even if a (small) proportion of the data is maliciously modified, the procedure continues to optimally estimate the true expected shortfall risk. We demonstrate that our estimator outperforms the classical plug-in estimator through a variety of numerical experiments across a range of standard loss distributions.
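
The fragility of the plug-in estimator to a single modified data point can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes one common form of the plug-in estimator, namely the average of the $\lceil \alpha N \rceil$ largest observed losses, and the function name `plug_in_es`, the Gaussian sample, and the corrupted value `1e6` are illustrative choices only.

```python
import math
import random


def plug_in_es(losses, alpha):
    """Assumed plug-in form: mean of the ceil(alpha * N) largest losses."""
    k = math.ceil(alpha * len(losses))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


# Illustrative data: 1000 standard normal losses (not the paper's setup).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
clean = plug_in_es(sample, alpha=0.1)

# Maliciously modify a single observation.
corrupted = list(sample)
corrupted[0] = 1e6
shifted = plug_in_es(corrupted, alpha=0.1)

print(f"clean sample:       {clean:.3f}")
print(f"one point modified: {shifted:.3f}")
```

Because the modified point lands among the largest losses, it enters the average with weight $1/\lceil \alpha N \rceil$, so the estimate can be moved arbitrarily far by a single observation.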

Paper Structure

This paper contains 18 sections, 22 theorems, 158 equations, 7 figures, and 1 table.

Key Result

Theorem 1.1

Assume that $\sigma^2_{\mathrm{ES}_\alpha(X)}$ is finite and that $u\mapsto \mathrm{VaR}_{u}(X)$ is continuous in $1- \alpha$. Then, as $N\to\infty$, we have the weak convergence

Figures (7)

  • Figure 1: The left image depicts $\mathbb{P}(|\widehat{T}_N - \mathrm{ES}_{\alpha}(X)| \geq 1)$ (plug-in estimator) and $\mathbb{P}(|\widehat{S}_N - \mathrm{ES}_{\alpha}(X)| \geq 1)$ (the proposed estimator uses $\beta_1 = 0.5, \beta_2 = 0.6$ and $m=250$) for varying values of $N$, showcasing the exponential rate for $\widehat{S}_N$ and the lack thereof for $\widehat{T}_N$. Here $\alpha=0.1$, $X$ is Pareto distributed with $\lambda=2.2$, and the deviation probabilities are estimated using $10^6$ many simulations. The middle image showcases the histogram of estimated values across simulations for $N=3250$, with the blue vertical lines depicting the values $\mathrm{ES}_{\alpha}(X)-1, \mathrm{ES}_{\alpha}(X), \mathrm{ES}_{\alpha}(X)+1$. The right hand image shows the tail behavior of the estimators. Among the $10^6$ simulations for $N = 3250$, there are 13637 errors larger than one for $\widehat{T}_N$ (largest realized value 98.98), and 1568 for $\widehat{S}_N$ (largest realized value 7.83).
  • Figure 2: The figures show $\mathbb{P}(|\widehat{R}_N - \mathrm{ES}_{\alpha}(X)| \geq \delta)$ (estimated using $10^6$ many experiments) for different estimators $\widehat{R}_N$, where $X$ is either Pareto distributed with $\lambda=2.1$ or $\lambda=3.5$. The proposed estimator and the median of blocks estimator use sub-intervals of size $m=250$. The left hand side illustrates that the confidence bands for $\widehat{S}_N$ and the median of blocks estimator behave like $\exp(-c_\delta N)$ as shown in Theorem \ref{thm:ES}, which is in contrast to the rate $\tilde{c}_\delta N^{-(\lambda - 1)}$ for $\widehat{T}_N$ as shown in Lemma \ref{lem:paretolower}. The figure further illustrates the regime switch for the proposed estimator. For large values of $\delta$ (left hand images, notice the scale on the $y$-axis), the proposed estimator behaves similarly to the median of blocks estimator; that is, statistically optimal with high confidence. For regimes with smaller $\delta$ (right hand images, notice the different scale on the $y$-axis), the proposed estimator manages to obtain an accuracy similar to the plug-in estimator.
  • Figure 3: The figures show $\mathbb{P}(|\widehat{R}_N - \mathrm{ES}_{\alpha}(X)| \geq \delta)$ (estimated using $10^6$ many experiments) for different estimators $\widehat{R}_N$, where $X$ follows a Student-$t$ distribution with parameter $\nu=2.2$. The proposed estimator and the median of blocks estimator use sub-intervals of size $m=250$.
  • Figure 4: The figures show $\mathbb{P}(|\widehat{R}_N - \mathrm{ES}_{\alpha}(X)| \geq \delta)$ (estimated using $10^6$ many experiments) for different estimators $\widehat{R}_N$, where $X$ follows a Pareto distribution with parameter $\lambda=2.1$. The proposed estimator uses sub-intervals of size $m=250$. Hill's estimator (cf. \cite{hill2015expected}) is a kind of trimmed estimator which cuts off the $[0.25 \cdot n^{1/3}]$ largest data points.
  • Figure 5: The left hand side shows $\mathbb{P}(|\widehat{R}_N - \mathrm{ES}_{\alpha}(X)| \geq \delta)$ (estimated using $10^6$ many experiments) for different estimators $\widehat{R}_N$, where $X$ is either standard normally distributed (top) or standard log-normally distributed (bottom). The proposed estimator and the median of blocks estimator use sub-intervals of size $m=125$. We see that the median of blocks estimator exhibits a noticeable negative bias, as shown by the histograms on the right hand side. The proposed estimator mitigates this downside almost completely and performs similarly to the plug-in estimator in these examples.
  • ...and 2 more figures
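
The kind of experiment described in the captions above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the authors' code: the plug-in estimator is taken to be the mean of the $\lceil \alpha N \rceil$ largest losses, the median of blocks estimator is taken to be the median of plug-in estimates over disjoint blocks of size $m$, and the Pareto law is assumed to satisfy $\mathbb{P}(X > x) = x^{-\lambda}$ for $x \geq 1$ (the paper's parametrization may differ). The proposed estimator $\widehat{S}_N$ is not implemented here, since its truncation step is not specified in this summary. All function names are hypothetical.

```python
import math
import random
import statistics


def plug_in_es(losses, alpha):
    """Assumed plug-in form: mean of the ceil(alpha * N) largest losses."""
    k = math.ceil(alpha * len(losses))
    return sum(sorted(losses, reverse=True)[:k]) / k


def median_of_blocks_es(losses, alpha, m):
    """Median of plug-in estimates over disjoint consecutive blocks of size m.

    A trailing block shorter than m is dropped; needs len(losses) >= m.
    """
    starts = range(0, len(losses) - m + 1, m)
    return statistics.median([plug_in_es(losses[s:s + m], alpha) for s in starts])


def pareto_es(alpha, lam):
    """Closed-form ES under the ASSUMED convention P(X > x) = x**(-lam), x >= 1."""
    return lam / (lam - 1.0) * alpha ** (-1.0 / lam)


# Settings loosely mirroring Figure 1; reps is far below the 10^6 used there.
alpha, lam, n, m, reps = 0.1, 2.2, 3250, 250, 100
true_es = pareto_es(alpha, lam)
rng = random.Random(1)
bad_plug_in = bad_blocks = 0
for _ in range(reps):
    # random.paretovariate draws from the same assumed Pareto convention.
    sample = [rng.paretovariate(lam) for _ in range(n)]
    bad_plug_in += abs(plug_in_es(sample, alpha) - true_es) >= 1.0
    bad_blocks += abs(median_of_blocks_es(sample, alpha, m) - true_es) >= 1.0

print(f"true ES (assumed convention): {true_es:.3f}")
print(f"share of errors >= 1, plug-in: {bad_plug_in / reps:.2f}")
print(f"share of errors >= 1, median of blocks: {bad_blocks / reps:.2f}")
```

With only 100 repetitions the sketch cannot resolve the small deviation probabilities reported in the figures; it only shows how such an experiment is assembled. Taking a median across blocks limits the influence of any single extreme observation to one block, at the cost of the negative bias noted for the median of blocks estimator in Figure 5.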

Theorems & Definitions (48)

  • Theorem 1.1: \cite{brazauskas2008estimating, pflug2010asymptotic}
  • Remark 1.2
  • Definition 1.3
  • Theorem 1.4
  • Remark 1.5
  • Remark 1.6
  • Corollary 1.7
  • Theorem 1.8
  • Lemma 2.1
  • proof
  • ...and 38 more