Does confidence calibration improve conformal prediction?
Huajun Xi, Jianguo Huang, Kangdao Liu, Lei Feng, Hongxin Wei
TL;DR
This work questions the conventional use of confidence calibration to improve conformal prediction. It shows empirically that post-hoc calibration often enlarges adaptive conformal prediction sets, while inducing high-confidence predictions with smaller temperatures can improve efficiency, although extremely low temperatures run into numerical issues. The authors provide a theoretical link between temperature and non-conformity scores, and introduce ConfTS, a variant of temperature scaling whose loss function optimizes for prediction-set efficiency and which generalizes to other post-hoc calibrators. Across image and text tasks, including large language models, ConfTS yields substantial efficiency gains without compromising marginal coverage, highlighting a practical path to more trustworthy uncertainty quantification.
Abstract
Conformal prediction is an emerging technique for uncertainty quantification that constructs prediction sets guaranteed to contain the true label with a predefined probability. Previous works often employ temperature scaling to calibrate classifiers, assuming that confidence calibration benefits conformal prediction. However, the specific impact of confidence calibration on conformal prediction remains underexplored. In this work, we make two key discoveries about the impact of confidence calibration methods on adaptive conformal prediction. First, we empirically show that current confidence calibration methods (e.g., temperature scaling) typically lead to larger prediction sets in adaptive conformal prediction. Second, by investigating the role of the temperature value, we observe that high-confidence predictions can enhance the efficiency of adaptive conformal prediction. Theoretically, we prove that predictions with higher confidence result in smaller prediction sets in expectation. This finding implies that the rescaling parameters in these calibration methods, when optimized with cross-entropy loss, might counteract the goal of generating efficient prediction sets. To address this issue, we propose Conformal Temperature Scaling (ConfTS), a variant of temperature scaling with a novel loss function designed to enhance the efficiency of prediction sets. This approach can be extended to optimize the parameters of other post-hoc methods of confidence calibration. Extensive experiments demonstrate that our method improves existing adaptive conformal prediction methods in classification tasks, especially with LLMs.
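To make the interaction between temperature and adaptive conformal prediction concrete, the following is a minimal sketch of temperature-scaled softmax combined with a non-randomized adaptive prediction set (APS) procedure. It is not the ConfTS loss itself, which the abstract does not specify; it only illustrates how a temperature enters the non-conformity score and hence the set size. All function names (`softmax`, `aps_scores`, `conformal_threshold`, `aps_sets`) are hypothetical, and the choice of non-randomized APS with a split-conformal quantile is an assumption made for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis of a 2-D logit array."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def aps_scores(probs):
    """Non-randomized APS score for every label:
    the total probability of all labels ranked at or above it."""
    order = np.argsort(-probs, axis=1)  # labels from most to least likely
    cumsum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    scores = np.empty_like(probs)
    np.put_along_axis(scores, order, cumsum, axis=1)  # undo the sort
    return scores

def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:  # too few calibration points: every label must be included
        return np.inf
    return np.sort(cal_scores)[k - 1]

def aps_sets(cal_logits, cal_labels, test_logits, alpha=0.1, temperature=1.0):
    """Boolean matrix of shape (n_test, n_classes); True marks labels in the set."""
    cal = aps_scores(softmax(cal_logits, temperature))
    true_scores = cal[np.arange(len(cal_labels)), cal_labels]
    tau = conformal_threshold(true_scores, alpha)
    return aps_scores(softmax(test_logits, temperature)) <= tau
```

Under these assumptions, calling `aps_sets(...).sum(axis=1).mean()` for several values of `temperature` on the same calibration and test logits gives the average set size at each temperature, which is the efficiency quantity the abstract refers to. The same temperature is applied to calibration and test logits so that the scores stay exchangeable and the marginal coverage guarantee is unaffected.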
