Correcting the Coverage Bias of Quantile Regression

Isaac Gibbs, John J. Cherian, Emmanuel J. Candès

TL;DR

This work addresses the failure of quantile regression to achieve its target coverage in high-dimensional regimes where the ratio of dimension to sample size, $d/n$, stays bounded away from zero. It introduces three deterministic debiasing methods (level adjustment, additive adjustment, and fixed dual thresholding) that calibrate predictions to achieve asymptotically exact coverage. Efficient computation rests on a novel link between leave-one-out coverage and the quantile regression dual. Theoretical results establish high-dimensional consistency and this leave-one-out duality, while empirical studies on simulated and real datasets demonstrate robust coverage and practical tradeoffs between interval length and computation. The methods offer model-agnostic calibration for quantile predictions, with potential impact on risk assessment and tail-bound analyses in high-dimensional settings.

Abstract

We develop a collection of methods for adjusting the predictions of quantile regression to ensure coverage. Our methods are model agnostic and can be used to correct for high-dimensional overfitting bias with only minimal assumptions. Theoretical results show that the estimates we develop are consistent and facilitate accurate calibration in the proportional asymptotic regime where the ratio of the dimension of the data and the sample size converges to a constant. This is further confirmed by experiments on both simulated and real data. A key component of our work is a new connection between the leave-one-out coverage and the fitted values of variables appearing in a dual formulation of the quantile regression problem. This facilitates the use of cross-validation in a variety of settings at significantly reduced computational costs.
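The overfitting bias described here can be reproduced with a small simulation. The following is a minimal sketch, not the authors' code: it fits unregularized quantile regression as a linear program with `scipy.optimize.linprog` and measures coverage on fresh test data drawn from a Gaussian linear model. The function names `fit_quantile_regression` and `simulate_coverage`, and the settings $n = 200$, $d = 40$, $\tau = 0.9$, five trials, and the random seed, are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def fit_quantile_regression(X, y, tau):
    """Fit unregularized quantile regression with an intercept via an LP.

    Minimizes tau * sum(u_plus) + (1 - tau) * sum(u_minus)
    subject to A w + u_plus - u_minus = y and u_plus, u_minus >= 0,
    where A is X with a leading column of ones.
    Returns the coefficient vector (intercept first).
    """
    n, d = X.shape
    A = np.hstack([np.ones((n, 1)), X])
    p = d + 1
    cost = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([A, np.eye(n), -np.eye(n)])
    # Coefficients are free; the positive and negative residual parts are nonnegative.
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


def simulate_coverage(n=200, d=40, tau=0.9, n_test=2000, n_trials=5, seed=0):
    """Average test coverage P(Y <= fitted quantile) over independent trials."""
    rng = np.random.default_rng(seed)
    coverages = []
    for _ in range(n_trials):
        # Population coefficients drawn as N(0, I_d / d).
        beta = rng.normal(scale=1.0 / np.sqrt(d), size=d)
        X = rng.normal(size=(n, d))
        y = X @ beta + rng.normal(size=n)
        X_test = rng.normal(size=(n_test, d))
        y_test = X_test @ beta + rng.normal(size=n_test)
        coef = fit_quantile_regression(X, y, tau)
        fitted = coef[0] + X_test @ coef[1:]
        coverages.append(np.mean(y_test <= fitted))
    return float(np.mean(coverages))


coverage = simulate_coverage()
print(f"target coverage: 0.9, average test coverage: {coverage:.3f}")
```

With $d/n = 0.2$, the average test coverage would be expected to fall below the nominal level $\tau$, which is the bias the adjustment methods are designed to remove.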

Paper Structure

This paper contains 21 sections, 34 theorems, 204 equations, 7 figures.

Key Result

Proposition 3.1

Assume that $\mathcal{R}$ is convex. Then, all dual solutions $\hat{\eta}$ and leave-one-out primal solutions $\hat{w}^{(-i)}$ satisfy the conditions and
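As background for the dual referred to here, a sketch of the textbook unregularized case may help. It assumes $\mathcal{R} \equiv 0$, writes $X \in \mathbb{R}^{n \times d}$ for the design matrix (with any intercept column included) and $Y \in \mathbb{R}^n$ for the responses, and uses the sign convention $\eta_i \in [-(1-\tau), \tau]$. The paper's own notation and its treatment of the regularizer may differ.

$$
\begin{aligned}
\text{(primal)} \quad & \min_{w \in \mathbb{R}^d} \ \sum_{i=1}^n \ell_\tau\left(Y_i - X_i^\top w\right), \qquad \ell_\tau(r) = \tau \max(r, 0) + (1-\tau) \max(-r, 0), \\
\text{(dual)} \quad & \max_{\eta \in \mathbb{R}^n} \ \eta^\top Y \quad \text{subject to} \quad X^\top \eta = 0, \quad -(1-\tau) \le \eta_i \le \tau \ \text{ for all } i.
\end{aligned}
$$

In this special case, complementary slackness ties each dual variable to the sign of its residual: $\hat{\eta}_i = \tau$ whenever $Y_i > X_i^\top \hat{w}$, and $\hat{\eta}_i = -(1-\tau)$ whenever $Y_i < X_i^\top \hat{w}$. Only points that the fit interpolates can take dual values strictly inside the interval.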

Figures (7)

  • Figure 1: Miscoverage of (unregularized) quantile regression fit with model $Y_i \sim \beta_0 + X_i^\top \beta$ on i.i.d. data $\{(X_i,Y_i)\}_{i=1}^n$ sampled as $Y_i = X_i^\top\tilde{\beta} + \epsilon_i$ for $X_i \sim \mathcal{N}(0,I_d)$, $\epsilon_i \sim \mathcal{N}(0,1)$, and $\epsilon_i \perp\!\!\!\perp X_i$. Boxplots in the figure show the empirical distribution of the training-conditional coverage, $\mathbb{P}(Y_{n+1} \leq \hat{\beta}_0 + X_{n+1}^\top \hat{\beta} \mid \{(X_i,Y_i)\}_{i=1}^n)$ where $(\hat{\beta}_0, \hat{\beta})$ denote the estimated coefficients at quantile level $\tau = 0.9$ and $(X_{n+1},Y_{n+1})$ is an independent sample from the same model. The results come from 100 trials where in each trial the coverage is evaluated over a test set of size 2000 and the population coefficients are sampled as $\tilde{\beta} \sim \mathcal{N}(0,I_d/d)$. The red line shows the target miscoverage level of $1-\tau = 0.1$.
  • Figure 2: Average value of $\hat{\tau}^{\text{adj.}}$ (left panel) and empirical miscoverage (right panel) of quantile regression fit with an adjusted level as the dimension of the data varies. Data for these experiments are sampled from the Gaussian linear model $Y_i = X_i^\top\tilde{\beta} + \epsilon_i$ with $X_i \sim \mathcal{N}(0,I_d)$, $\epsilon_i \sim \mathcal{N}(0,1)$, and $\epsilon_i \perp\!\!\!\perp X_i$. Dots and error bars in the left panel show estimated means and 95% confidence intervals from 100 trials where in each trial the population coefficients are sampled as $\tilde{\beta} \sim \mathcal{N}(0,I_d/d)$. Boxplots in the right panel show the empirical distribution of the training-conditional miscoverage evaluated over the same 100 trials where in each trial the miscoverage is estimated on a test set of size 2000. The red line shows the target miscoverage of $1-\tau = 0.1$.
  • Figure 3: Average coefficient estimation error of quantile regression fit with an adjusted level (blue), adjusted regularization (orange), and a joint level and regularization adjustment (green) as the dimension of the data varies. Data for these experiments are sampled as in Figure 2 and the target miscoverage is set as $1-\tau = 0.9$. Dots and error bars show estimated means and 95% confidence intervals from 100 trials. All regularization levels are chosen from the grid $n^{-1}\Lambda = \{0,0.005,0.01,\dots,0.1\}$.
  • Figure 4: Empirical estimate of the mean selected value of $\hat{c}$ (left panel), realized miscoverage for varying dimension (center panel), and realized miscoverage as $c$ varies (right panel) of the unregularized additive adjustment. Data for this experiment are sampled from the Gaussian linear model $Y_i = X_i^\top\tilde{\beta} + \epsilon_i$ with $X_i \sim \mathcal{N}(0,I_d)$, $\epsilon_i \sim \mathcal{N}(0,1)$, and $\epsilon_i \perp\!\!\!\perp X_i$. Dots and error bars in the left panel show estimated means and 95% confidence intervals taken over 100 trials where in each trial the population coefficients are sampled as $\tilde{\beta} \sim \mathcal{N}(0,I_d/d)$. Boxplots in the center and right panels show the empirical distribution of the training-conditional miscoverage evaluated over the same 100 trials where in each trial the miscoverage is estimated on a test set of size 2000. The black line in the left panel shows the maximum allowable value for $\hat{c}$, while red lines in the center and right panels show the target miscoverage of $1-\tau = 0.1$.
  • Figure 5: Empirical estimates of the average adjusted quantile (left panel) and miscoverage (right panel) of the randomized method of GCC2025 conditional on the cutoff $U$. Data for this experiment are sampled from the Gaussian linear model $Y_i = X_i^\top\tilde{\beta} + \epsilon_i$ where $X_i \sim \mathcal{N}(0,I_d)$ and $\epsilon_i \sim \mathcal{N}(0,1)$ with $X_i \perp\!\!\!\perp \epsilon_i$. Dots and error bars show means and 95% confidence intervals obtained over 2000 samples of the combined training and test dataset $\{(X_i,Y_i)\}_{i=1}^{n+1}$ where in each sample the population coefficients are generated as $\tilde{\beta} \sim \mathcal{N}(0,I_d/d)$. Throughout, we set $d=40$ and $n=200$. The red line in the right panel indicates the target miscoverage level of $1-\tau = 0.1$.
  • ...and 2 more figures

Theorems & Definitions (63)

  • Proposition 3.1
  • Example 3.1
  • Example 3.2
  • Theorem 3.1
  • Theorem 3.1
  • Theorem 4.1
  • Corollary 4.1
  • Corollary 4.2
  • Theorem 4.2
  • Lemma A.1
  • ...and 53 more