Optimal nonparametric estimation of the expected shortfall risk
Daniel Bartl, Stephan Eckstein
TL;DR
The paper tackles nonparametric estimation of the expected shortfall $\mathrm{ES}_\alpha(X)$ from finite i.i.d. samples, showing that the standard plug-in estimator struggles with heavy-tailed losses. It introduces a novel estimator $\widehat{S}_N$, built from blockwise plug-ins and truncation via quantile-based bounds, which achieves statistically optimal non-asymptotic guarantees and adversarial robustness. The authors establish minimax lower bounds, quantify the impact of Lipschitz continuity of the quantile function, and provide explicit Pareto-case calculations to illustrate the results; they also extend the framework to non-i.i.d. data under mixing and discuss practical extensions. Numerical experiments demonstrate exponential-tail confidence for $\widehat{S}_N$, improved finite-sample performance over the plug-in, and strong resilience to data contamination, highlighting the estimator's potential for robust risk assessment in finance.
Abstract
We address the problem of estimating the expected shortfall risk of a financial loss using a finite number of i.i.d. data points. It is well known that the classical plug-in estimator suffers from poor statistical performance when faced with (heavy-tailed) distributions that are commonly used in financial contexts. Further, it lacks robustness, as the modification of even a single data point can cause a significant distortion. We propose a novel procedure for the estimation of the expected shortfall and prove that it recovers the best possible statistical properties (dictated by the central limit theorem) under minimal assumptions and for every finite sample size. Further, this estimator is adversarially robust: even if a (small) proportion of the data is maliciously modified, the procedure continues to optimally estimate the true expected shortfall risk. We demonstrate that our estimator outperforms the classical plug-in estimator through a variety of numerical experiments across a range of standard loss distributions.
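The lack of robustness of the plug-in estimator can be illustrated with a minimal sketch. It is not taken from the paper: the function name `plug_in_es`, the level 0.95, the Pareto-type sample, and the corrupted value are all illustrative assumptions. The sketch also assumes the convention in which the expected shortfall at level alpha averages the worst (1 - alpha) fraction of losses, and it evaluates this for the empirical distribution through the Rockafellar-Uryasev representation.

```python
import math
import numpy as np

def plug_in_es(losses, alpha):
    """Expected shortfall of the empirical distribution of `losses`.

    Assumed convention: average of the worst (1 - alpha) fraction of losses,
    with 0 < alpha < 1. Computed as q + E[(X - q)^+] / (1 - alpha), where q
    is an empirical alpha-quantile (Rockafellar-Uryasev representation).
    """
    x = np.sort(np.asarray(losses, dtype=float))
    n = x.size
    k = max(math.ceil(alpha * n), 1)   # 1-based index of an empirical alpha-quantile
    q = x[k - 1]
    return q + np.maximum(x - q, 0.0).mean() / (1.0 - alpha)

# Illustrative data: heavy-tailed (Lomax/Pareto-type) losses, fixed seed.
rng = np.random.default_rng(0)
sample = rng.pareto(3.0, size=1000)

# Hypothetical contamination: overwrite a single observation with a huge value.
corrupted = sample.copy()
corrupted[0] = 1e6

clean_es = plug_in_es(sample, 0.95)
corrupted_es = plug_in_es(corrupted, 0.95)
print(f"plug-in ES (clean):     {clean_es:.3f}")
print(f"plug-in ES (corrupted): {corrupted_es:.3f}")
```

With 1000 observations and alpha = 0.95, only about 50 points enter the tail average, so a single corrupted value of size M shifts the plug-in estimate by roughly M / 50. This is the kind of distortion the abstract refers to.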
