Composite Quantile Regression With XGBoost Using the Novel Arctan Pinball Loss

Laurens Sluijterman, Frank Kreuwel, Eric Cator, Tom Heskes

TL;DR

The paper tackles composite quantile regression with XGBoost, where the traditional pinball loss is ill-suited to the model's second-order optimization because it is nondifferentiable at zero and has a zero second derivative everywhere else. It introduces the arctan pinball loss, a smooth alternative with a non-vanishing second derivative, which lets a single XGBoost model predict multiple quantiles simultaneously and reduces quantile crossings. The authors derive theoretical properties and provide practical recommendations for hyperparameters and model setup. Empirical results on toy data, UCI benchmarks, and electricity-grid substation data show competitive coverage with substantially fewer crossings, highlight the method's scalability and efficiency, and raise considerations for calibration and extrapolation. A public implementation accompanies the work to facilitate adoption in uncertainty quantification and risk-aware forecasting tasks.

Abstract

This paper explores the use of XGBoost for composite quantile regression. XGBoost is a highly popular model renowned for its flexibility, efficiency, and capability to deal with missing data. The optimization uses a second-order approximation of the loss function, complicating the use of loss functions with a zero or vanishing second derivative. Quantile regression -- a popular approach to obtain conditional quantiles when point estimates alone are insufficient -- unfortunately uses such a loss function, the pinball loss. Existing workarounds are typically inefficient and can result in severe quantile crossings. In this paper, we present a smooth approximation of the pinball loss, the arctan pinball loss, that is tailored to the needs of XGBoost. Specifically, contrary to other smooth approximations, the arctan pinball loss has a relatively large second derivative, which makes it more suitable for use in the second-order approximation. Using this loss function enables the simultaneous prediction of multiple quantiles, which is more efficient and results in far fewer quantile crossings.
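
The abstract describes the arctan pinball loss but does not state its formula. The following is a minimal sketch that assumes the form $L^{(\text{arctan})}_{\tau, s}(u) = \left(\tau - \tfrac{1}{2} + \arctan(u/s)/\pi\right) u + s/\pi$ with residual $u = y - \hat{y}$ and smoothing parameter $s > 0$. That form, and all function names below, are assumptions made for illustration and are not taken from this page. The sketch compares the smooth loss with the standard pinball loss and evaluates its first and second derivatives.

```python
import numpy as np

def pinball(u, tau):
    """Standard pinball loss for the residual u = y - prediction."""
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

def arctan_pinball(u, tau, s):
    """Assumed form of the arctan pinball loss (s > 0 controls smoothing)."""
    return (tau - 0.5 + np.arctan(u / s) / np.pi) * u + s / np.pi

def arctan_pinball_grad(u, tau, s):
    """First derivative of the assumed loss with respect to u."""
    v = u / s
    return tau - 0.5 + np.arctan(v) / np.pi + v / (np.pi * (1.0 + v**2))

def arctan_pinball_hess(u, tau, s):
    """Second derivative with respect to u; positive for every finite u."""
    v = u / s
    return 2.0 / (np.pi * s * (1.0 + v**2) ** 2)

# Illustrative values only (tau = 0.9 and s = 0.1 match the figure captions).
tau, s = 0.9, 0.1
u = np.linspace(-5.0, 5.0, 1001)

# Largest pointwise difference between the smooth loss and the pinball loss.
gap = np.max(np.abs(arctan_pinball(u, tau, s) - pinball(u, tau)))
hess = arctan_pinball_hess(u, tau, s)
```

Under this assumed form the gap never exceeds $s/\pi$, and the second derivative decays only polynomially in $|u|/s$. Because $u = y - \hat{y}$, the gradient with respect to the prediction $\hat{y}$ is the negative of `arctan_pinball_grad`, while the second derivative is unchanged.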

Paper Structure

This paper contains 19 sections, 26 equations, 9 figures, and 2 tables.

Figures (9)

  • Figure 1: The pinball loss for two different values of $\tau$.
  • Figure 2: A comparison of $L^{(\text{exp})}_{\tau, s}(u)$ and $L^{(\text{arctan})}_{\tau, s}(u)$ for $\tau=0.9$ and $s=0.1$. Both the exponential approximation and the arctan approximation approximate the pinball loss very closely. However, as is displayed in (b), the second derivative of the arctan pinball loss is much larger.
  • Figure 3: Three scenarios where crossings can occur, not to scale. While the optimum of the arctan pinball loss has no crossings, individual updates can result in crossings due to the quadratic approximation. The resulting update is proportional to the gradient divided by the second derivative, denoted by $h$.
  • Figure 4: A comparison of $L^{(\text{exp})}_{\tau, s}(u)$ and $L^{(\text{arctan})}_{\tau, s}(u)$ for $\tau=0.9$ and $s=0.1$ at a very small scale. Close to the origin, the bias in both approximations is clear. Both approximations have the actual minimum of the loss function slightly below $u=0$. This results in slightly more conservative quantiles, meaning larger quantiles for $\tau > 0.5$ and smaller quantiles for $\tau<0.5$ compared to when using the regular pinball loss. This effect is larger when using a larger value of $s$.
  • Figure 5: Illustration of the train/val/test split used for the Alliander data.
  • ...and 4 more figures
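
The captions of Figures 3 and 4 describe two effects that can be checked numerically: the minimum of the smooth loss sits slightly away from $u=0$, and each update is proportional to the gradient divided by the second derivative $h$. The sketch below assumes the arctan pinball loss has the form $\left(\tau - \tfrac{1}{2} + \arctan(u/s)/\pi\right) u + s/\pi$ with $u = y - \hat{y}$. This form is an assumption, not stated on this page, and the helper names are hypothetical. The step function ignores XGBoost's regularization and learning rate.

```python
import math

def grad(u, tau, s):
    """First derivative of the assumed arctan pinball loss w.r.t. u."""
    v = u / s
    return tau - 0.5 + math.atan(v) / math.pi + v / (math.pi * (1.0 + v * v))

def hess(u, tau, s):
    """Second derivative (the h of the Figure 3 caption), always positive."""
    v = u / s
    return 2.0 / (math.pi * s * (1.0 + v * v) ** 2)

def minimizer(tau, s, lo=-10.0, hi=10.0, iters=200):
    """Locate the zero of the gradient by bisection.

    The gradient is increasing in u because hess > 0, so a sign change
    on [lo, hi] brackets the unique minimum.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if grad(mid, tau, s) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def raw_newton_step(u, tau, s):
    """Unregularized change in the prediction: gradient divided by h."""
    return grad(u, tau, s) / hess(u, tau, s)

# Illustrative values only; tau = 0.9 and s = 0.1 match the captions.
u_min_high = minimizer(0.9, 0.1)     # lies below zero for tau > 0.5
u_min_low = minimizer(0.1, 0.1)      # lies above zero for tau < 0.5
u_min_wide = minimizer(0.9, 0.2)     # larger s shifts the minimum further
big_step = raw_newton_step(1.0, 0.9, 0.1)
```

A minimum at negative $u$ means the fitted quantile lies above the point where the pinball loss would place it, which is the conservative shift the Figure 4 caption describes for $\tau > 0.5$. The raw step grows quickly once $|u|$ is large relative to $s$, because $h$ shrinks there. This is one way to see how a single update can overshoot and produce a crossing even though the optimum has none.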