Conformal Prediction for Deep Classifier via Label Ranking

Jianguo Huang; Huajun Xi; Linjun Zhang; Huaxiu Yao; Yue Qiu; Hongxin Wei

TL;DR

This work addresses the inefficiency of conformal prediction in deep multiclass classifiers caused by long-tailed softmax probabilities. It introduces Sorted Adaptive Prediction Sets (SAPS), which discard all probability values except the maximum softmax probability and rely on label ranking to construct prediction sets with guaranteed marginal coverage. The authors provide theoretical insights showing probability magnitudes are largely unnecessary for CP and demonstrate that SAPS achieves dramatically smaller sets and higher conditional coverage than APS and RAPS across ImageNet and CIFAR benchmarks, including under distribution shifts. The method is simple to implement on top of any pretrained classifier and improves instance-wise uncertainty communication, with practical implications for risk-sensitive applications.

Abstract

Conformal prediction is a statistical framework that generates prediction sets containing ground-truth labels with a desired coverage guarantee. The predicted probabilities produced by machine learning models are generally miscalibrated, leading to large prediction sets in conformal prediction. To address this issue, we propose a novel algorithm named $\textit{Sorted Adaptive Prediction Sets}$ (SAPS), which discards all the probability values except for the maximum softmax probability. The key idea behind SAPS is to minimize the dependence of the non-conformity score on the probability values while retaining the uncertainty information. In this manner, SAPS can produce compact prediction sets and communicate instance-wise uncertainty. Extensive experiments validate that SAPS not only reduces the size of the prediction sets but also broadly improves their conditional coverage rate.
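The idea of keeping only the maximum softmax probability and the label ranking can be illustrated with a short sketch. The exact score formula is not given in this summary, so the form below (the top-ranked label scores `u * p_max`, and every lower-ranked label scores `p_max + (rank - 2 + u) * lam`) is an assumption about how such a score might look, and the names `saps_scores`, `saps_prediction_set`, `lam`, and `u` are hypothetical.

```python
import numpy as np

def saps_scores(probs, labels, lam=0.1, u=None, rng=None):
    """Rank-based non-conformity score (assumed form, see lead-in).

    probs:  (n, K) softmax outputs
    labels: (n,) candidate label for each row
    lam:    weight placed on the label rank (hyperparameter)
    u:      optional (n,) uniform random variables for randomization
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n = probs.shape[0]
    if u is None:
        rng = np.random.default_rng() if rng is None else rng
        u = rng.uniform(size=n)
    u = np.asarray(u, dtype=float)

    # Only the largest probability of each row is kept.
    p_max = probs.max(axis=1)

    # Rank of each candidate label: 1 means most likely.
    order = np.argsort(-probs, axis=1, kind="stable")
    ranks = np.argsort(order, axis=1)[np.arange(n), labels] + 1

    # All other probability values are replaced by the constant lam.
    return np.where(ranks == 1, u * p_max, p_max + (ranks - 2 + u) * lam)

def saps_prediction_set(prob_row, tau, lam=0.1, u=0.5):
    """Return every label whose score does not exceed the threshold tau."""
    k = len(prob_row)
    scores = saps_scores(np.tile(prob_row, (k, 1)), np.arange(k),
                         lam=lam, u=np.full(k, u))
    return np.flatnonzero(scores <= tau)
```

In this sketch, a confident prediction (large `p_max`) pushes the scores of all lower-ranked labels up together, so the set size depends on the instance only through `p_max` and the rank, not on the long tail of small probabilities.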

Paper Structure (30 sections, 4 theorems, 20 equations, 4 figures, 10 tables)

Introduction
Preliminaries
Conformal prediction.
Adaptive prediction sets (APS).
Motivation and method
Motivation
Probability values may not be necessary.
A theoretical interpretation.
Method
Experiments
Experimental Setup
Datasets.
Models.
Conformal prediction algorithms.
Results
...and 15 more sections

Key Result

Theorem 2.1

(Angelopoulos et al., 2020) Suppose the calibration data $(X_i,Y_i,U_i)_{i=1,\dots,n}$ and a test instance $(X_{n+1},Y_{n+1},U_{n+1})$ are exchangeable. Let the set-valued function $\mathcal{C}_{1-\alpha}(\boldsymbol{x},u;\tau)$ satisfy the nesting property in $\tau$ (see the paper's nesting-property equation).
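The role of the threshold $\tau$ in a result of this kind can be sketched with the usual split-conformal calibration step. The quantile rule used below, the $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score, is the standard choice in the conformal prediction literature and is assumed here, since the remainder of the theorem is not reproduced in this summary. The name `conformal_threshold` is hypothetical.

```python
import math
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Calibrate tau from non-conformity scores of held-out calibration data.

    cal_scores: scores of the true labels on n calibration examples
    alpha:      target miscoverage rate (e.g. 0.1 for 90% coverage)
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    # Finite-sample correction: use (n + 1) rather than n.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the full label
        # set can be returned, which an infinite threshold encodes.
        return math.inf
    return float(scores[k - 1])
```

At test time, a label is included in the prediction set whenever its score is at most the returned threshold. The nesting property matters here because it ensures that raising $\tau$ can only enlarge the set.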

Figures (4)

Figure 1: (a) Softmax probabilities for an ImageNet instance, sorted in descending order. (b) Set size for various models. "w/ value" and "w/o value" denote vanilla APS and APS without probability values, respectively. The numbers in brackets are the model accuracies. Removing the probability values decreases the prediction set sizes.
Figure 2: (a) ESCV for different models on ImageNet with $\alpha=0.1$. A good conformal prediction algorithm should keep the y-axis value (i.e., ESCV) small. SAPS outperforms RAPS in most cases. (b) Set size on ImageNet-V2 at $\alpha=0.1$. (c) Set size at various difficulty levels for multiple models on ImageNet. Easy examples require small sets, while hard ones require large sets.
Figure 3: (a) The average ground-truth label ranking under different maximum softmax probabilities. Instances with higher $\hat{\pi}_{max}$ have smaller ground-truth label rankings. (b) Effect of $\lambda$ on set size across various models. The black markers ($\bigstar,\blacklozenge,\blacktriangle,\bullet$) represent the results of APS without probability values. (c) Effect of the calibration dataset size on set size across various models. (d) Relationship between temperature and the set size of SAPS on ResNet152, where the horizontal axis shows the log-transformed temperature $T$.
Figure 4: ESCV for different models on CIFAR-10 and CIFAR-100 with $\alpha=0.1$.

Theorems & Definitions (6)

Theorem 2.1
Lemma 3.1
Proposition 3.2
Theorem 3.3
proof
proof
