Label-wise Aleatoric and Epistemic Uncertainty Quantification
Yusuf Sale, Paul Hofman, Timo Löhr, Lisa Wimmer, Thomas Nagler, Eyke Hüllermeier
TL;DR
The paper addresses predictive uncertainty quantification in multiclass classification by introducing a label-wise decomposition based on second-order distributions $Q \in \Delta_K^{(2)}$, which separates total, aleatoric, and epistemic uncertainty at the level of individual classes. It develops entropy-based and variance-based per-label formulations within a loss-based framework, where instantiating $\phi$ as log-loss yields entropy and instantiating it as squared-error loss yields variance, and it aggregates the per-label values into global measures. The authors set out an axiomatic foundation for second-order uncertainty measures (A0–A7) and show that both the entropy- and variance-based measures satisfy many of these properties, with variance-based measures offering advantages such as addressing A5 and fitting binary label-wise decisions naturally. Empirically, the approach is evaluated on medical imaging data (PET/CT) and standard benchmarks, where it yields meaningful per-class uncertainty insights and competitive performance in accuracy-rejection and OoD detection tasks while remaining coherent with global uncertainty assessments. The work thus supports cost-sensitive decision-making, improved interpretability, and uncertainty quantification in safety-critical applications, with code available for reproducibility.
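To make the label-wise decomposition concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that the second-order distribution $Q$ is approximated by a finite set of $M$ first-order probability vectors (for example, the outputs of ensemble members), and that each label $k$ is treated as a binary problem with success probability $\theta_k$. Under these assumptions, total uncertainty is the uncertainty of the averaged prediction, aleatoric uncertainty is the average uncertainty of the individual predictions, and epistemic uncertainty is their difference; in the variance case this difference equals the variance of $\theta_k$ by the law of total variance. The function names `binary_entropy`, `labelwise_uncertainty`, and `aggregate` are hypothetical, and the paper's exact definitions may differ in detail.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def labelwise_uncertainty(samples, measure="variance"):
    """Per-label total, aleatoric, and epistemic uncertainty.

    samples: list of M probability vectors of length K, assumed here to be
             draws from (or an ensemble approximation of) the second-order
             distribution Q.
    measure: "variance" or "entropy".
    Returns a list of K dicts with keys "total", "aleatoric", "epistemic".
    """
    m = len(samples)
    k = len(samples[0])
    result = []
    for j in range(k):
        thetas = [s[j] for s in samples]
        mean = sum(thetas) / m
        if measure == "variance":
            # Variance of a Bernoulli variable with the averaged probability.
            total = mean * (1.0 - mean)
            # Expected conditional variance over the samples.
            aleatoric = sum(t * (1.0 - t) for t in thetas) / m
        elif measure == "entropy":
            # Entropy of the averaged prediction for this label.
            total = binary_entropy(mean)
            # Expected entropy of the individual predictions.
            aleatoric = sum(binary_entropy(t) for t in thetas) / m
        else:
            raise ValueError("measure must be 'variance' or 'entropy'")
        # Epistemic part is the remainder: Var(theta_k) in the variance
        # case, mutual information in the entropy case.
        result.append({
            "total": total,
            "aleatoric": aleatoric,
            "epistemic": total - aleatoric,
        })
    return result


def aggregate(per_label):
    """Global measures, taken here as the sum of the per-label values."""
    return {
        key: sum(entry[key] for entry in per_label)
        for key in ("total", "aleatoric", "epistemic")
    }


if __name__ == "__main__":
    # Two hypothetical ensemble members that disagree on labels 0 and 1.
    ensemble = [[0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
    per_label = labelwise_uncertainty(ensemble, measure="variance")
    for idx, entry in enumerate(per_label):
        print(idx, entry)
    print(aggregate(per_label))
```

In this toy ensemble the two members disagree on the first two labels, so those labels carry non-zero epistemic uncertainty, while the third label, on which both members assign probability zero, carries none. Summation is used for aggregation as an assumption that matches the TL;DR's statement that per-label values are aggregated into global measures.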
Abstract
We present a novel approach to uncertainty quantification in classification tasks based on label-wise decomposition of uncertainty measures. This label-wise perspective allows uncertainty to be quantified at the individual class level, thereby improving cost-sensitive decision-making and helping to understand the sources of uncertainty. Furthermore, it allows total, aleatoric, and epistemic uncertainty to be defined on the basis of non-categorical measures such as variance, going beyond common entropy-based measures. In particular, variance-based measures address some of the limitations of established methods that have recently been discussed in the literature. We show that our proposed measures adhere to a number of desirable properties. Through empirical evaluation on a variety of benchmark data sets -- including applications in the medical domain where accurate uncertainty quantification is crucial -- we establish the effectiveness of label-wise uncertainty quantification.
