Adjusting Regression Models for Conditional Uncertainty Calibration

Ruijiang Gao; Mingzhang Yin; James McInerney; Nathan Kallus

TL;DR

This paper proposes a novel algorithm that trains a regression function to improve conditional coverage after the split conformal prediction procedure is applied, and it establishes an upper bound on the miscoverage gap between the conditional coverage and the nominal coverage rate.

Abstract

Conformal Prediction methods have finite-sample distribution-free marginal coverage guarantees. However, they generally do not offer conditional coverage guarantees, which can be important for high-stakes decisions. In this paper, we propose a novel algorithm to train a regression function to improve the conditional coverage after applying the split conformal prediction procedure. We establish an upper bound for the miscoverage gap between the conditional coverage and the nominal coverage rate and propose an end-to-end algorithm to control this upper bound. We demonstrate the efficacy of our method empirically on synthetic and real-world datasets.
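For readers unfamiliar with the split conformal prediction procedure the abstract builds on, the following is a minimal sketch of the standard procedure with the absolute residual score. It is not the training algorithm proposed in the paper. The function name `split_conformal_interval`, the `predict` callable, and the synthetic data in the demo are all hypothetical and chosen only for illustration.

```python
import numpy as np


def split_conformal_interval(predict, X_cal, y_cal, X_test, alpha=0.1):
    """Standard split conformal intervals with the absolute residual score.

    `predict` is any already-fitted regression function (an assumption of
    this sketch); it must not have been trained on the calibration data.
    """
    # Nonconformity scores on the held-out calibration set: |y - f(x)|.
    scores = np.abs(y_cal - predict(X_cal))
    n = scores.shape[0]
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the interval must be infinite.
    q = np.inf if k > n else np.sort(scores)[k - 1]
    preds = predict(X_test)
    return preds - q, preds + q


if __name__ == "__main__":
    # Hypothetical synthetic data: linear signal with Gaussian noise.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(3000, 1))
    y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=3000)
    X_tr, y_tr = X[:1000], y[:1000]
    X_cal, y_cal = X[1000:2000], y[1000:2000]
    X_te, y_te = X[2000:], y[2000:]

    # Fit ordinary least squares with an intercept on the training split.
    A_tr = np.c_[X_tr, np.ones(len(X_tr))]
    w, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)

    def predict(X_new):
        return np.c_[X_new, np.ones(len(X_new))] @ w

    lo, hi = split_conformal_interval(predict, X_cal, y_cal, X_te, alpha=0.1)
    print("empirical marginal coverage:", np.mean((y_te >= lo) & (y_te <= hi)))
```

Because the interval half-width `q` is a single constant, the intervals have the same width for every input. That is why this baseline can attain marginal coverage while still missing conditional coverage, which is the gap the paper targets.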

Paper Structure (20 sections, 5 theorems, 25 equations, 6 figures, 4 tables, 2 algorithms)

Introduction
Related Work
Problem Statement
Connection with Kolmogorov–Smirnov (KS) Distance
Optimizing KS Distance
Experiments
Synthetic Data
Real-World Data
Ablation Study
Conclusion
Proof
Baseline
Data Statistics
Additional Results on Synthetic Data
Additional Results on Ablation Studies
...and 5 more sections

Key Result

Theorem 1

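Assume $\{(X_i, Y_i)\}_{i=1}^n$ are independent and identically distributed. Then the split conformal prediction set satisfies the marginal coverage guarantee at the nominal level. If, in addition, $V(X_{n+1},Y_{n+1})$ has a continuous distribution, then the marginal coverage is also bounded from above.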
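In its standard form, the marginal guarantee for split conformal prediction can be written as the pair of inequalities below. This is a sketch under assumed notation: $\hat{C}$ for the prediction set and $\alpha$ for the miscoverage level are symbols introduced here, and the $1/(n+1)$ constant is the usual one for $n$ calibration points rather than a quotation from the paper.

```latex
\mathbb{P}\bigl(Y_{n+1} \in \hat{C}(X_{n+1})\bigr) \;\ge\; 1 - \alpha,
\qquad
\mathbb{P}\bigl(Y_{n+1} \in \hat{C}(X_{n+1})\bigr) \;\le\; 1 - \alpha + \frac{1}{n+1}.
```

The lower bound needs only exchangeability of the calibration and test points. The upper bound relies on the score $V$ having a continuous distribution, so that ties among scores occur with probability zero.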

Figures (6)

Figure 1: Coverage under Synthetic Data (Setting I) with Linear Regression, $1-\alpha=90\%$. Here we show the conditional coverage for each method. Our method can achieve the specified conditional coverage while all other methods have significantly lower conditional coverage.
Figure 2: WSLAB for UCI Datasets with residual score across $\alpha$. Our method consistently improves WSLAB across all values of $\alpha$.
Figure 3: Ablation Studies for Different Choices of $\lambda$ for Synthetic Data Setup I. We report the Marginal Coverage (MC), Conditional Coverage (CC), Set Size, and MSE for $\log(\lambda) = -1,0,1,2,3$ when $1-\alpha=90\%$.
Figure 4: Ablation Studies for Different Choices of $\lambda$ for Synthetic Data Setup II. We report the Marginal Coverage (MC), Conditional Coverage (CC), Set Size, and MSE for $\log(\lambda) = -1,0,1,2,3$ when $1-\alpha=90\%$.
Figure 5: Coverage under Synthetic Data (Setting II) with Linear Regression, $1-\alpha=90\%$. Here we show the conditional coverage for each method. Like most methods, KS-CP can achieve perfect conditional coverage in this case.
...and 1 more figure

Theorems & Definitions (9)

Theorem 1: Marginal Coverage Guarantee (Vovk et al., 2005)
Proposition 2
proof
Proposition 3: Conditional Coverage Rate
Proposition 4
Proposition 5
Proof for \ref{prop:ccr}
Proof for \ref{prop:dpi}
Proof for \ref{prop:asy}
