Deep neural expected shortfall regression with tail-robustness

Myeonghun Yu, Kean Ming Tan, Huixia Judy Wang, Wen-Xin Zhou

TL;DR

This work tackles conditional tail risk estimation by developing a nonparametric, deep-learning framework for expected shortfall (ES) regression that handles heavy-tailed data. It uses a two-step, orthogonal approach: a deep quantile regression (DQR) to estimate the conditional quantile $q_\alpha(Y|X)$, followed by a robust ES regression (DES/DRES) that regresses a surrogate response $Z_i(f)$ with a tunable Huber loss to achieve tail robustness. The authors establish non-asymptotic error bounds and convergence rates under hierarchical function classes, showing that deep networks can mitigate the curse of dimensionality in ES regression. They also implement practical mechanisms for non-crossing ES/QR functions and demonstrate strong empirical performance on Monte Carlo simulations and a climate risk application analyzing El Niño–related extreme precipitation.
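The two-step construction can be illustrated with a small numerical check. The sketch below is not the authors' implementation: it assumes the surrogate response takes the commonly used orthogonal form $Z_i(f) = f(X_i) + \min\{Y_i - f(X_i), 0\}/\alpha$, drops the covariates, and plugs in the known 10% quantile of a standard normal in place of a fitted DQR network. The function names `pinball_loss` and `surrogate_response` are hypothetical.

```python
import numpy as np

def pinball_loss(residual, alpha):
    """Check (pinball) loss minimized in step 1 to fit the alpha-quantile."""
    return np.mean(residual * (alpha - (residual < 0)))

def surrogate_response(y, q_hat, alpha):
    """Assumed surrogate: Z = q(X) + min(Y - q(X), 0) / alpha.

    When q_hat is the true conditional alpha-quantile, E[Z | X] equals the
    lower-tail expected shortfall, so step 2 can regress Z on X.
    """
    return q_hat + np.minimum(y - q_hat, 0.0) / alpha

rng = np.random.default_rng(0)
alpha = 0.1
y = rng.standard_normal(200_000)  # toy response without covariates

q_true = -1.2815515655446004  # 10% quantile of N(0, 1)
# Closed-form 10% ES of N(0, 1): -pdf(q) / alpha
es_true = -np.exp(-q_true**2 / 2) / np.sqrt(2 * np.pi) / alpha

# Step 2 with an intercept-only "regression": average the surrogate.
es_hat = surrogate_response(y, q_true, alpha).mean()

# Perturb the step-1 quantile; the ES estimate should barely move.
es_hat_shifted = surrogate_response(y, q_true + 0.05, alpha).mean()

print(f"true ES            : {es_true:.4f}")
print(f"estimate           : {es_hat:.4f}")
print(f"estimate, q + 0.05 : {es_hat_shifted:.4f}")
```

Shifting the plugged-in quantile changes the averaged surrogate only at second order, which is the sense in which the second step is first-order insensitive to errors in the first.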

Abstract

Expected shortfall (ES), also known as conditional value-at-risk, is a widely recognized risk measure that complements value-at-risk by capturing tail-related risks more effectively. Compared with quantile regression, which has been extensively developed and applied across disciplines, ES regression remains in its early stage, partly because the traditional empirical risk minimization framework is not directly applicable. In this paper, we develop a nonparametric framework for expected shortfall regression based on a two-step approach that treats the conditional quantile function as a nuisance parameter. Leveraging the representational power of deep neural networks, we construct a two-step ES estimator using feedforward ReLU networks, which can alleviate the curse of dimensionality when the underlying functions possess hierarchical composition structures. However, ES estimation is inherently sensitive to heavy-tailed response or error distributions. To address this challenge, we integrate a properly tuned Huber loss into the neural network training, yielding a robust deep ES estimator that is provably resistant to heavy-tailedness in a non-asymptotic sense and first-order insensitive to quantile estimation errors in the first stage. Comprehensive simulation studies and an empirical analysis of the effect of El Niño on extreme precipitation illustrate the accuracy and robustness of the proposed method.
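To see why a Huber loss helps with heavy tails, consider the minimal sketch below. It is an illustration under simplifying assumptions, not the paper's estimator: the neural network is replaced by a single intercept, the data are a toy sample with one gross outlier, and the robustification parameter `tau = 3.0` is an arbitrary choice rather than the tuned value the paper analyzes. The helper names are hypothetical.

```python
import numpy as np

def huber_loss(u, tau):
    """Huber loss: quadratic for |u| <= tau, linear beyond."""
    a = np.abs(u)
    return np.where(a <= tau, 0.5 * u**2, tau * a - 0.5 * tau**2)

def huber_score(u, tau):
    """Derivative of the Huber loss: the residual clipped to [-tau, tau]."""
    return np.clip(u, -tau, tau)

def huber_location(z, tau, n_iter=200):
    """Intercept-only Huber fit via gradient descent with unit step size.

    The mean Huber loss has curvature at most 1, so a unit step is stable.
    """
    theta = np.median(z)
    for _ in range(n_iter):
        theta += np.mean(huber_score(z - theta, tau))
    return theta

rng = np.random.default_rng(1)
# 1,000 well-behaved points centered at 0, plus one extreme observation.
z = np.concatenate([rng.standard_normal(1000), [1e4]])

print("sample mean   :", z.mean())
print("Huber estimate:", huber_location(z, tau=3.0))
```

The single outlier drags the sample mean far from zero, while the Huber fit caps its influence at `tau`. A larger `tau` reduces the bias introduced by clipping but weakens this protection, which is why the choice of `tau` matters.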

Paper Structure

This paper contains 38 sections, 24 theorems, 230 equations, 5 figures, 3 tables.

Key Result

Proposition 1

Assume that the quantile residual $\epsilon$ satisfies Condition cond:moment.condition for some $p > 1$. Then the global minimizer $g_{0,\tau}$ defined in def:Huber.ES.estimator is unique up to sets of probability zero with respect to $X$, and satisfies provided that $\tau \geq (4\nu_p)^{1/p}$ for $1 < p < 2$, or $\tau \geq (4\nu_2)^{1/2}$ for $p \geq 2$.

Figures (5)

  • Figure 1: Boxplots of $\widehat{\mathrm{MSPE}}$ (based on 200 repetitions) for the four estimators--LLES, DES, DRES, and NC-DRES--in estimating the conditional 10% ES function under the location–scale model $Y = h_1(X) + h_2(X)\eta$, where $X \in [0,1]^8$ and sample size is $n = 4{,}096$.
  • Figure 2: Empirical mean squared prediction error ($\widehat{\mathrm{MSPE}}$) versus training sample size, which ranges from 2,048 to 9,216, based on 100 replications. As before, the target is the conditional 10% ES function from a location-scale model.
  • Figure 3: Histogram and log-log plot of precipitation. The blue and red horizontal lines represent the sample mean and the $99\%$ quantile of precipitation.
  • Figure 4: Subfigures (a)--(d) show the differences in the predicted ES of precipitation between El Niño and non-El Niño conditions across the four seasons, while subfigures (e)--(h) show the corresponding differences in the predicted mean precipitation. In the plots, red indicates increased precipitation during El Niño, and blue represents decreased precipitation.
  • Figure 5: Variable permutation importance for conditional mean and ES regressions.

Theorems & Definitions (37)

  • Example 1: Location-scale model
  • Example 2: Quantile regression process
  • Proposition 1
  • Remark 2: Examples of heavy-tailed distributions
  • Remark 3
  • Definition 4: Hölder class of functions $\mathcal{H}^\beta(\mathcal{X}, M_0)$
  • Definition 5: Hierarchical interaction model
  • Remark 7
  • Theorem 8: Oracle-type inequality for the DQR estimator
  • Theorem 9: Convergence rate for the DQR estimator
  • ...and 27 more