Wasserstein Distributionally Robust Quantile Regression

Chunxu Zhang, Tiantian Mao, Ruodu Wang

Abstract

We study distributionally robust quantile regression using type-$p$ Wasserstein ambiguity sets. We derive a closed-form expression for the worst-case quantile regression loss under general $p$-Wasserstein uncertainty. We further give a uniqueness result showing that for $p>1$, the check loss yields the only class of convex loss functions for which such an additive Wasserstein regularization holds. Our analysis also uncovers qualitative differences between the regimes $p=1$ and $p>1$. When $p>1$, the slope coefficients coincide with those of the regularized formulation, while the intercept undergoes a radius-dependent adjustment; the value $p$ affects only this intercept correction, whereas the choice of transport norm influences both. Finally, we establish finite-sample out-of-sample risk guarantees of order $O(N^{-1/2})$ under mild moment conditions. Numerical experiments illustrate the theoretical findings and the practical implications of the proposed formulation.
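To make the phrase "additive Wasserstein regularization" concrete, the following minimal Python sketch evaluates the check (pinball) loss and an empirical objective with an added norm penalty. The names `check_loss`, `regularized_qr_objective`, `tau`, `eps`, and `dual_ord` are hypothetical. The penalty form used here, the radius times $\max(\tau, 1-\tau)$ times a norm of $(\boldsymbol{\beta}, -1)$, is an assumption chosen for illustration; it is not the closed-form expression derived in the paper, which is not reproduced on this page.

```python
import numpy as np


def check_loss(u, tau):
    """Check (pinball) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    u = np.asarray(u, dtype=float)
    return np.maximum(tau * u, (tau - 1.0) * u)


def regularized_qr_objective(beta, s, X, y, tau, eps, dual_ord=2):
    """Empirical check loss plus an additive norm penalty (illustrative form only)."""
    beta = np.asarray(beta, dtype=float)
    residuals = np.asarray(y, dtype=float) - np.asarray(X, dtype=float) @ beta - s
    empirical = check_loss(residuals, tau).mean()
    # Assumption: penalty = eps * max(tau, 1 - tau) * ||(beta, -1)||,
    # with the norm order picked by the hypothetical parameter dual_ord.
    extended = np.append(beta, -1.0)
    penalty = eps * max(tau, 1.0 - tau) * np.linalg.norm(extended, ord=dual_ord)
    return empirical + penalty


if __name__ == "__main__":
    X = np.array([[1.0], [2.0]])
    y = np.array([1.0, -1.0])
    print(regularized_qr_objective(np.array([0.0]), 0.0, X, y, tau=0.5, eps=0.1))
```

Setting `eps=0` recovers the ordinary empirical quantile regression objective, so the radius acts purely as a regularization strength in this sketch.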

Paper Structure

This paper contains 23 sections, 18 theorems, 118 equations, 3 figures, 1 table.

Key Result

Theorem 1

For $p\geqslant 1$, $\varepsilon\geqslant 0$, $F_0\in \mathcal{M}(\mathbb{R}^{d+1})$, the problem OP is equivalent to the following convex program in the sense that they share the same optimal value and that if $(\boldsymbol{\beta}^*,\overline{s}^*)$ solves prob-1, then $(\boldsymbol{\beta}^*,s^* )$ is the optimal solution to the problem OP, where

Figures (3)

  • Figure 1: Out-of-sample quantile loss under different Wasserstein balls.
  • Figure 2: Test quantile loss versus Wasserstein radius for DR-QR and R-QR across different training sample sizes.
  • Figure 3: Wasserstein radius versus training sample size for cross-validated, optimal, and empirically valid bounds.

Theorems & Definitions (31)

  • Theorem 1
  • Theorem 2
  • Proposition 1: Worst-case distribution for $p=1$
  • Proposition 2: Worst-case distribution for $p=\infty$
  • Proposition 3: Worst-case distribution for $p\in(1,\infty)$
  • Theorem 3
  • Proposition 4
  • Proposition 5
  • Theorem 4
  • Lemma A1
  • ...and 21 more