Efficient Credal Prediction through Decalibration

Paul Hofman, Timo Löhr, Maximilian Muschalik, Yusuf Sale, Eyke Hüllermeier

TL;DR

This work proposes an efficient method for credal prediction that is grounded in the notion of relative likelihood and inspired by techniques for the calibration of probabilistic classifiers. It demonstrates credal prediction on models such as TabPFN and CLIP, architectures for which the construction of credal sets was previously infeasible.

Abstract

A reliable representation of uncertainty is essential for the application of modern machine learning methods in safety-critical settings. In this regard, the use of credal sets (i.e., convex sets of probability distributions) has recently been proposed as a suitable approach to representing epistemic uncertainty. However, as with other approaches to epistemic uncertainty, training credal predictors is computationally complex and usually involves (re-)training an ensemble of models. The resulting computational complexity prevents their adoption for complex models such as foundation models and multi-modal systems. To address this problem, we propose an efficient method for credal prediction that is grounded in the notion of relative likelihood and inspired by techniques for the calibration of probabilistic classifiers. For each class label, our method predicts a range of plausible probabilities in the form of an interval. To produce the lower and upper bounds of these intervals, we propose a technique that we refer to as decalibration. Extensive experiments show that our method yields credal sets with strong performance across diverse tasks, including coverage-efficiency evaluation, out-of-distribution detection, and in-context learning. Notably, we demonstrate credal prediction on models such as TabPFN and CLIP, architectures for which the construction of credal sets was previously infeasible.
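
To make the interval representation concrete, the following is a minimal sketch of how per-class probability intervals induce a credal set: the set of all distributions on the simplex whose class probabilities lie within the bounds. The function names and the example bounds are hypothetical and chosen only for illustration; how the method actually produces the bounds (decalibration) is not shown here.

```python
import numpy as np

def is_nonempty(lower, upper):
    """Check whether interval bounds admit at least one probability distribution.

    For bounds inside [0, 1] with lower <= upper, this holds exactly when
    sum(lower) <= 1 <= sum(upper).
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    valid = np.all(lower >= 0.0) and np.all(upper <= 1.0) and np.all(lower <= upper)
    return bool(valid and lower.sum() <= 1.0 <= upper.sum())

def contains(lower, upper, q, tol=1e-9):
    """Check whether distribution q lies in the credal set given by the bounds."""
    q = np.asarray(q, dtype=float)
    on_simplex = np.all(q >= -tol) and abs(q.sum() - 1.0) <= tol
    in_box = np.all(q >= np.asarray(lower) - tol) and np.all(q <= np.asarray(upper) + tol)
    return bool(on_simplex and in_box)

# Hypothetical intervals for a three-class problem.
lower = [0.5, 0.1, 0.0]
upper = [0.8, 0.4, 0.2]

print(is_nonempty(lower, upper))                 # bounds are compatible
print(contains(lower, upper, [0.6, 0.3, 0.1]))   # inside every interval
print(contains(lower, upper, [0.9, 0.1, 0.0]))   # class 0 exceeds its upper bound
```

Wider intervals yield a larger credal set, which is one way to read the coverage-efficiency trade-off mentioned in the abstract.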

Paper Structure

This paper contains 58 sections, 3 theorems, 23 equations, 15 figures, and 8 tables.

Key Result

Proposition 2.1

If $0<\alpha_2\le \alpha_1\le 1$, then $\mathcal{C}_{\alpha_1}\subseteq \mathcal{C}_{\alpha_2}$ and $\mathcal{Q}_{\bm{x},\alpha_1}\subseteq \mathcal{Q}_{\bm{x},\alpha_2}$. Thus, for all $k$, $[\underline p_k(\bm{x};\alpha_1),\overline p_k(\bm{x};\alpha_1)]\subseteq[\underline p_k(\bm{x};\alpha_2),\overline p_k(\bm{x};\alpha_2)]$. If a maximum-likelihood estimator $h^{\mathrm{ML}}\in\mathcal{H}$ exists, then $\mathcal{Q}_{\bm{x},1}=\{p_k(\bm{x}, h^{\mathrm{ML}})\}$ and $[\underline p_k(\bm{x};1),\overline p_k(\bm{x};1)]=\{p_k(\bm{x}, h^{\mathrm{ML}})\}$.
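
The nestedness property can be illustrated numerically. The sketch below assumes the usual relative-likelihood reading, in which $\mathcal{C}_{\alpha}$ collects all hypotheses whose likelihood is at least $\alpha$ times the maximum likelihood, and it replaces the hypothesis space $\mathcal{H}$ with a small finite candidate set. Both the candidate set and all numbers are hypothetical; this is not the paper's algorithm, only a toy check of the statement.

```python
import numpy as np

def alpha_cut(log_liks, alpha):
    """Indices of hypotheses with relative likelihood L(h) / max L >= alpha."""
    log_liks = np.asarray(log_liks, dtype=float)
    rel = np.exp(log_liks - log_liks.max())  # relative likelihoods in (0, 1]
    return np.flatnonzero(rel >= alpha)

def class_intervals(probs, log_liks, alpha):
    """Per-class lower and upper probabilities over the alpha-cut.

    probs has shape (n_hypotheses, n_classes): each row is one hypothesis'
    predicted distribution for a single query x.
    """
    keep = alpha_cut(log_liks, alpha)
    selected = np.asarray(probs, dtype=float)[keep]
    return selected.min(axis=0), selected.max(axis=0)

# Hypothetical log-likelihoods and predictions of four candidate hypotheses.
log_liks = np.array([-10.0, -10.5, -12.0, -15.0])
probs = np.array([
    [0.7, 0.2, 0.1],   # maximum-likelihood hypothesis
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.3, 0.3, 0.4],
])

for alpha in (1.0, 0.5, 0.1):
    lo, hi = class_intervals(probs, log_liks, alpha)
    print(alpha, alpha_cut(log_liks, alpha), lo, hi)
```

Lowering $\alpha$ admits more hypotheses, so the intervals can only widen, and at $\alpha=1$ only the maximum-likelihood hypothesis remains, so each interval collapses to a single point.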

Figures (15)

  • Figure 1: Overview of Efficient Credal Prediction through Decalibration. Given a probabilistic classifier (maximum likelihood estimate), our method decalibrates the predicted distributions by their logits. The resulting credal set contains the ground-truth distribution, as visualized in the credal spider plot (see the appendix guide on visualization for an explanation). Note that we only show the decalibration of three classes for visualization purposes; in practice, all classes are decalibrated.
  • Figure 2: Coverage versus Efficiency. Comparison on CIFAR-10 and ChaosNLI. The plot highlights the Pareto trade-off: higher coverage often requires lower efficiency. EffCre consistently advances the Pareto front over baselines.
  • Figure 3: Out-of-Distribution Detection. Performance (AUROC, based on epistemic uncertainty) as a function of required number of models and training time (in hours).
  • Figure 4: EffCre used with TabPFN. Top: Coverage versus efficiency performance across all multi-class TabArena datasets. Bottom: Active In-Context Learning performance versus the random baseline.
  • Figure 4: Training and inference time in seconds for models trained on CIFAR-10. Mean with standard deviation over three runs. Computed based on ensembles with 10 members.
  • ...and 10 more figures

Theorems & Definitions (6)

  • Proposition 2.1
  • Proposition 3.1
  • Corollary 3.1
  • Proof of Proposition 2.1
  • Proof of Proposition 3.1
  • Proof of Corollary 3.1