Expected Shortfall Regression via Optimization

Yuanzhi Li, Shushu Zhang, Xuming He

Abstract

To provide a comprehensive summary of the tail distribution, the expected shortfall is defined as the average over the tail above (or below) a certain quantile of the distribution. The expected shortfall regression captures the heterogeneous covariate-response relationship and describes the covariate effects on the tail of the response distribution. Based on a critical observation that the superquantile regression from the operations research literature does not coincide with the expected shortfall regression, we propose and validate a novel optimization-based approach for the linear expected shortfall regression, without additional assumptions on the conditional quantile models. While the proposed loss function is implicitly defined, we provide a prototype implementation of the proposed approach with some initial expected shortfall estimators based on binning techniques. With practically feasible initial estimators, we establish the consistency and the asymptotic normality of the proposed estimator. The proposed approach achieves heterogeneity-adaptive weights and therefore often offers efficiency gain over existing linear expected shortfall regression approaches in the literature, as demonstrated through simulation studies.
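As an illustration of the definition above, the following minimal sketch computes an unconditional, empirical expected shortfall from a sample: the average of the observations at or beyond the sample $\tau$-quantile. It is not the regression estimator proposed in the paper; the function name `expected_shortfall` and the `upper` flag are hypothetical names chosen for this example.

```python
import numpy as np


def expected_shortfall(y, tau, upper=True):
    """Empirical expected shortfall of a sample (illustrative sketch only).

    Averages the observations at or above the sample tau-quantile when
    upper=True, or at or below it when upper=False.
    """
    y = np.asarray(y, dtype=float)
    q = np.quantile(y, tau)  # sample quantile, NumPy's default linear interpolation
    tail = y[y >= q] if upper else y[y <= q]
    return tail.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=10_000)
    print("upper 0.9 ES:", expected_shortfall(sample, 0.9))
    print("lower 0.05 ES:", expected_shortfall(sample, 0.05, upper=False))
```

With `upper=False` and a small `tau`, the same function gives a lower-tail summary such as the lower $0.05$ ES used in the birth weight application.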

Paper Structure

This paper contains 67 sections, 25 theorems, 303 equations, 17 figures, 6 tables, 2 algorithms.

Key Result

Theorem 2.1

Suppose the cumulative distribution function of $Y \mid X=x$ is continuous and strictly increasing in a neighborhood of $q_{Y|X}(\tau, x)$, and the matrix is positive definite. Then, under the linear ES regression model \ref{eq::SQ-model-linear}, where $\beta$ is the true ES regression coefficient and the minimizer is uniquely identified.

Figures (17)

  • Figure 1: Violin plots of the ARE of the i-Rock and the joint approaches relative to the TS approach, at $\tau = 0.9$ under Model \ref{eq::comparison-simple} with $p = 3$. Here, the ARE of one estimator $\widehat{\beta}_1$ relative to another estimator $\widehat{\beta}_2$ is defined as $\lVert \text{AVar}(\widehat{\beta}_2) \rVert / \lVert \text{AVar}(\widehat{\beta}_1) \rVert$, where $\lVert \cdot \rVert$ is the Frobenius norm (left panel) or the determinant (right panel). Each element of the covariates $\tilde{X}$ takes values independently from $\{i/10;~i=0,1,\ldots,10\}$ with equal probability. For $\gamma_1 = (\gamma_{10},\gamma_{11}^T)^T$ and $\gamma_2 = (\gamma_{20},\gamma_{21}^T)^T$, we fix $\gamma_{10} = \gamma_{20} = 3$ and randomly sample $200$ different values of $\gamma_{11}$ and $\gamma_{21}$ independently and uniformly in the cube $[-1,3]^{3}$.
  • Figure 2: Numerical comparisons of the i-Rock approach (with linear or B-spline quantile function estimation) and two-step estimator under linear heteroscedastic model \ref{eq::2d_cont_disc} at various quantile levels and sample sizes.
  • Figure 3: Numerical comparisons of the i-Rock approach (with linear or B-spline quantile function estimation) and two-step approach under Model \ref{eq::sim_2d_nonlinear} at $\tau=0.9$, $n = 10000$.
  • Figure 4: The quantities shown from part of $\beta_1 - \beta_0$ associated with Equation \ref{eq::disparity_driver} are the birth weight disparities (the lower $0.05$ ES) of the disadvantaged groups for subgroups defined by the number of prenatal visits: $[6, 10]$ and $>10$, with the subgroup of $\leq 5$ prenatal visits serving as the baseline.
  • Figure 5: The population level loss function $L_1(\theta_1)$ (left panel) and its derivative (right panel). The blue dashed line marks the true ES regression coefficient $\beta_1$, while the red one marks the minimizer of $L_1(\theta_1)$.
  • ...and 12 more figures
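
The ARE defined in the Figure 1 caption can be sketched as a ratio of matrix summaries of two asymptotic covariance matrices. This is a minimal illustration under the caption's definition only; the function name `are` and the `norm` argument are hypothetical, and the covariance matrices in the usage lines are made-up inputs rather than results from the paper.

```python
import numpy as np


def are(avar_1, avar_2, norm="fro"):
    """ARE of estimator 1 relative to estimator 2: ||AVar_2|| / ||AVar_1||.

    norm="fro" uses the Frobenius norm; norm="det" uses the determinant.
    """
    a1 = np.asarray(avar_1, dtype=float)
    a2 = np.asarray(avar_2, dtype=float)
    if norm == "fro":
        return np.linalg.norm(a2, "fro") / np.linalg.norm(a1, "fro")
    if norm == "det":
        return np.linalg.det(a2) / np.linalg.det(a1)
    raise ValueError("norm must be 'fro' or 'det'")


if __name__ == "__main__":
    # Made-up covariance matrices, for illustration only.
    avar_a = np.eye(2)
    avar_b = 2.0 * np.eye(2)
    print(are(avar_a, avar_b, norm="fro"))
    print(are(avar_a, avar_b, norm="det"))
```

Under this definition, a value above 1 means the first estimator has the smaller asymptotic variance summary and is therefore the more efficient of the two.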

Theorems & Definitions (53)

  • Theorem 2.1
  • Corollary 1
  • Theorem 3.1
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • proof
  • proof
  • Theorem C.1
  • Corollary 2
  • ...and 43 more