LoBoost: Fast Model-Native Local Conformal Prediction for Gradient-Boosted Trees

Vagner Santos, Victor Coscrato, Luben Cabezas, Rafael Izbicki, Thiago Ramos

TL;DR

LoBoost is a model-native local conformal method that reuses the fitted ensemble's leaf structure to define multiscale calibration groups. It requires no retraining, auxiliary models, or extra splitting beyond the standard train/calibration split.

Abstract

Gradient-boosted decision trees are among the strongest off-the-shelf predictors for tabular regression, but point predictions alone do not quantify uncertainty. Conformal prediction provides distribution-free marginal coverage, yet split conformal uses a single global residual quantile and can be poorly adaptive under heteroscedasticity. Methods that improve adaptivity typically fit auxiliary nuisance models or introduce additional data splits/partitions to learn the conformal score, increasing cost and reducing data efficiency. We propose LoBoost, a model-native local conformal method that reuses the fitted ensemble's leaf structure to define multiscale calibration groups. Each input is encoded by its sequence of visited leaves; at resolution level k, we group points by matching prefixes of leaf indices across the first k trees and calibrate residual quantiles within each group. LoBoost requires no retraining, auxiliary models, or extra splitting beyond the standard train/calibration split. Experiments show competitive interval quality, improved test MSE on most datasets, and large calibration speedups.

Paper Structure

This paper contains 42 sections, 5 theorems, 84 equations, 3 figures, 9 tables, 1 algorithm.

Key Result

Proposition 1

Fix $x \in \mathcal{X}$. For any $\varepsilon>0$ and any $k\le T$, we have where we write:

Figures (3)

  • Figure 1: Comparison of prediction intervals and conditional coverage for LoBoost and ICP on two synthetic mechanisms. Top: estimated intervals with the oracle interval and test data; LoBoost tracks the oracle more closely, handling heteroscedasticity and discontinuities better than ICP. Bottom: empirical conditional coverage across the feature space (target $1-\alpha=0.9$); LoBoost stays near 0.9, while ICP deviates, reflecting LoBoost's meaningful feature-space partitions.
  • Figure 2: LoBoost partition. For convenience, we consider $h_i(x)$ as decision stumps. White nodes indicate dense regions (at least $N_{part}$ calibration instances) that proceed to the next resolution level. Red nodes denote terminal regions (fewer than $N_{part}$ calibration instances). The example results in 5 terminal regions.
  • Figure 3: Empirical second-moment decay curves for fixed reference points $x$ within boosting-induced regions $\mathcal{R}_k(x)$ (here, $k=3$). The first row corresponds to simulation setting 1 and the second row to simulation setting 2. In each panel, we estimate $\mathbb{E}\left[(h_t(X)-h_t(x))^2 \mid X\in\mathcal{R}_k(x)\right]$ empirically as a function of $t$ and fit an exponential decay model $V_x(t)\approx C\rho^t$.
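The refinement rule in the Figure 2 caption can be sketched as follows, under stated assumptions. A region holding at least $N_{part}$ calibration points is split by the leaf index of the next tree, and a region with fewer points becomes terminal. Stopping when all T trees are used, and returning no region for a leaf combination unseen in calibration, are assumptions of this sketch rather than details taken from the paper. The names `terminal_regions` and `find_region` are hypothetical.

```python
from collections import defaultdict

import numpy as np


def terminal_regions(leaves_cal, n_part):
    """Map each terminal leaf-prefix to the calibration indices it contains."""
    n, n_trees = leaves_cal.shape
    terminal = {}
    frontier = {(): list(range(n))}  # the empty prefix is the whole space
    for k in range(n_trees + 1):
        next_frontier = defaultdict(list)
        for prefix, idx in frontier.items():
            if len(idx) < n_part or k == n_trees:
                # Sparse region, or no trees left to refine with: stop here.
                terminal[prefix] = idx
            else:
                # Dense region: refine using the leaf visited in tree k.
                for i in idx:
                    next_frontier[prefix + (int(leaves_cal[i, k]),)].append(i)
        frontier = next_frontier
        if not frontier:
            break
    return terminal


def find_region(leaves_x, terminal):
    """Return the terminal prefix containing a point, or None if unseen.
    Terminal prefixes are never extended, so at most one prefix matches."""
    for k in range(len(leaves_x) + 1):
        prefix = tuple(int(v) for v in leaves_x[:k])
        if prefix in terminal:
            return prefix
    return None


# Toy example with two trees and n_part = 2.
leaves_cal = np.array([[0, 0], [0, 0], [0, 1], [0, 1], [1, 0]])
regions = terminal_regions(leaves_cal, n_part=2)
print(regions)
print(find_region(np.array([0, 1]), regions))
```

In this toy example the region for leaf 1 of the first tree holds a single calibration point and stops at the first level, while the region for leaf 0 is dense enough to be refined by the second tree. Residual quantiles would then be calibrated within each terminal region.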

Theorems & Definitions (9)

  • Proposition 1: Continuity of Gradient Boosting
  • Corollary 1
  • Theorem 1: Local coverage
  • Corollary 2: Local behavior of the residual
  • Theorem 2: Asymptotic conditional coverage
  • Proof of Proposition 1
  • Proof of Corollary 1
  • Proof of Corollary 2
  • Proof of Theorem 2