Credal and Interval Deep Evidential Classifications

Michele Caprio, Shireen K. Manchingal, Fabio Cuzzolin

TL;DR

The paper tackles uncertainty quantification in classification by introducing Credal and Interval Deep Evidential Classifications (CDEC and IDEC), which model epistemic and aleatoric uncertainty via credal sets and intervals, respectively. Both methods can abstain when uncertainty is too high and otherwise return set-valued predictions with probabilistic guarantees. Built on Dirichlet-Categorical foundations and normalizing-flow posteriors, CDEC constructs predictive credal sets from multiple encoders/flows, while IDEC uses an interval inflation approach with a single posterior; both yield strong out-of-distribution (OoD) detection and calibrated predictions across MNIST and CIFAR benchmarks. Extensive experiments show competitive predictive performance, sharp uncertainty decompositions, and robust Imprecise Highest Density Region (IHDR) behavior, with a small ensemble sufficing for stable uncertainty estimates in CDEC. The work advances practical, interpretable uncertainty quantification for deep classifiers under distribution shift.

Abstract

Uncertainty Quantification (UQ) presents a pivotal challenge in the field of Artificial Intelligence (AI), profoundly impacting decision-making, risk assessment and model reliability. In this paper, we introduce Credal and Interval Deep Evidential Classifications (CDEC and IDEC, respectively) as novel approaches to address UQ in classification tasks. CDEC and IDEC leverage a credal set (closed and convex set of probabilities) and an interval of evidential predictive distributions, respectively, allowing us to avoid overfitting to the training data and to systematically assess both epistemic (reducible) and aleatoric (irreducible) uncertainties. When those surpass acceptable thresholds, CDEC and IDEC have the capability to abstain from classification and flag an excess of epistemic or aleatoric uncertainty, as relevant. Conversely, within acceptable uncertainty bounds, CDEC and IDEC provide a collection of labels with robust probabilistic guarantees. CDEC and IDEC are trained using standard backpropagation and a loss function that draws from the theory of evidence. They overcome the shortcomings of previous efforts, and extend the current evidential deep learning literature. Through extensive experiments on MNIST, CIFAR-10 and CIFAR-100, together with their natural OoD shifts (F-MNIST/K-MNIST, SVHN/Intel, TinyImageNet), we show that CDEC and IDEC achieve competitive predictive accuracy, state-of-the-art OoD detection under epistemic and total uncertainty, and tight, well-calibrated prediction regions that expand reliably under distribution shift. An ablation over ensemble size further demonstrates that CDEC attains stable uncertainty estimates with only a small ensemble.
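The abstain-or-predict behavior described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `decide`, the thresholds `eu_max` and `au_max`, and the numeric values are hypothetical, and the epistemic and aleatoric uncertainty values are assumed to have been computed elsewhere.

```python
def decide(epistemic, aleatoric, label_set, eu_max, au_max):
    """Abstain if either uncertainty exceeds its (hypothetical) threshold;
    otherwise return the set of candidate labels."""
    excess = []
    if epistemic > eu_max:
        excess.append("epistemic")
    if aleatoric > au_max:
        excess.append("aleatoric")
    if excess:
        # Flag which kind of uncertainty caused the abstention.
        return {"action": "abstain", "excess": excess}
    return {"action": "predict", "labels": sorted(label_set)}


# Illustrative values only; they are not taken from the paper.
confident = decide(0.05, 0.30, {3, 8}, eu_max=0.2, au_max=1.0)
shifted = decide(0.90, 0.30, {3, 8}, eu_max=0.2, au_max=1.0)
```

Under these assumed numbers, `confident` carries the label set `[3, 8]`, while `shifted` abstains and reports an excess of epistemic uncertainty.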

Paper Structure

This paper contains 41 sections, 7 theorems, 56 equations, 8 figures, 10 tables, 4 algorithms.

Key Result

Theorem 3.2

Suppose, without loss of generality, that $\text{ex}\mathcal{P}_\text{pred} = \{\text{Cat}(\pi^\prime_s)\}_{s=1}^S$. Let $\Delta^{S-1}$ denote the $(S-1)$-dimensional unit simplex, and let $\underline{H}(P^\text{ex})\coloneqq\min_{P^\text{ex} \in \text{ex}\mathcal{P}_\text{pred} }H(P^\text{ex})$. Then,
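As a small illustration of the quantity $\underline{H}$ defined above, the following sketch computes the Shannon entropy of each extreme categorical distribution and takes the minimum. The helper names and the three probability vectors are hypothetical, chosen only for illustration, and the sketch assumes $H$ is the Shannon entropy measured in nats.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def lower_entropy(extreme_points):
    """Minimum entropy over a finite list of extreme categorical distributions."""
    return min(shannon_entropy(pi) for pi in extreme_points)


# Hypothetical extreme points Cat(pi'_s), s = 1..3, over three classes.
extreme_points = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
]
h_low = lower_entropy(extreme_points)
```

Because Shannon entropy is concave, its minimum over a convex hull is attained at an extreme point, so scanning the $S$ extreme points is enough to find the minimum over the whole set they generate.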

Figures (8)

  • Figure 1: iD vs OoD IHDR set size across datasets. Bars show the mean IHDR cardinality for CDEC-3 and IDEC on in-distribution data (blue) and on one or two OoD datasets per benchmark (orange/salmon). CDEC-3 exhibits a clear expansion of the IHDR under distribution shift, whereas IDEC shows a smaller and less consistent separation between iD and OoD.
  • Figure 2: Effect of ensemble size $S$ on IHDR distributions and coverage for CDEC. IHDR sizes (violin + median) and coverage are shown for MNIST, CIFAR-10, and CIFAR-100. Ensemble sizes $S \geq 3$ yield stable IHDR behavior across datasets, whereas $S=1$ underestimates epistemic uncertainty on CIFAR-10 and CIFAR-100.
  • Figure 3: CIFAR-10 uncertainty distributions for PostNet, PostNet-3, CDEC-3 and IDEC.
  • Figure 4: MNIST uncertainty distributions for PostNet, PostNet-3, CDEC and IDEC.
  • Figure 5: CIFAR-100 uncertainty distributions for PostNet, PostNet-3, CDEC and IDEC.
  • ...and 3 more figures

Theorems & Definitions (19)

  • Remark 1
  • Definition 3.1: Credal Set
  • Theorem 3.2: Total and Aleatoric Uncertainties
  • Definition 4.1: Lower and Upper Probabilities
  • Definition 4.2: Imprecise Highest Density Region
  • Theorem 4.3: Lower Bound for the Lower Probability of an IHDR
  • Remark 2
  • Definition 5.1: Interval of Measures
  • Example
  • Proposition 5.2: Linking Intervals of Measures with Credal Sets
  • ...and 9 more