Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification
Kaizheng Wang, Fabio Cuzzolin, Keivan Shariatmadar, David Moens, Hans Hallez
TL;DR
The paper introduces a credal wrapper that improves uncertainty estimation in classification by converting a limited set of single predictive distributions, obtained from Bayesian neural networks (BNNs) or deep ensembles (DEs), into per-class probability intervals $[p_{L_k}, p_{U_k}]$ that define a credal set. A single intersection probability $p^*$, computed as $p^*_k = p_{L_k} + \alpha(p_{U_k}-p_{L_k})$ with $\alpha = (1-\sum_k p_{L_k})/(\sum_k (p_{U_k}-p_{L_k}))$, maps the credal set back to a definite prediction. The framework uses the probability intervals to quantify epistemic uncertainty and employs generalized entropy to compute upper and lower uncertainty measures, which support out-of-distribution (OOD) detection; a practical Probability Interval Approximation (PIA) reduces the computation for problems with many classes. Empirical results across multiple datasets and architectures show that, relative to standard BNN/DE baselines and evidential methods, the credal wrapper improves uncertainty quantification and OOD detection and achieves better calibration (lower expected calibration error, ECE) on corrupted data. Overall, the approach provides a principled, plug-and-play way to better capture epistemic uncertainty in classification when only a limited ensemble of predictive distributions is available, albeit at a higher computational cost.
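The interval extraction and the intersection probability formula above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `credal_bounds` and `intersection_probability` are hypothetical, the bounds are assumed to be the per-class minimum and maximum over the ensemble members, and the PIA step and the entropy-based uncertainty measures are omitted.

```python
from typing import List, Sequence, Tuple


def credal_bounds(
    member_probs: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Per-class lower/upper probabilities from a set of predictive distributions.

    Assumption: each row is one member's softmax output, and the bounds are
    the per-class minimum and maximum over the members.
    """
    per_class = list(zip(*member_probs))  # transpose: one tuple per class
    p_lower = [min(values) for values in per_class]
    p_upper = [max(values) for values in per_class]
    return p_lower, p_upper


def intersection_probability(
    p_lower: Sequence[float], p_upper: Sequence[float]
) -> List[float]:
    """p*_k = p_L_k + alpha * (p_U_k - p_L_k), with alpha shared by all classes."""
    widths = [u - l for l, u in zip(p_lower, p_upper)]
    total_width = sum(widths)
    if total_width < 1e-12:
        # All members agree, so the intervals collapse to a single distribution.
        return list(p_lower)
    alpha = (1.0 - sum(p_lower)) / total_width
    return [l + alpha * w for l, w in zip(p_lower, widths)]


# Three hypothetical ensemble members predicting over three classes.
members = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]
p_lower, p_upper = credal_bounds(members)
p_star = intersection_probability(p_lower, p_upper)
```

For these illustrative inputs the lower bounds are (0.5, 0.1, 0.1) and the upper bounds are (0.7, 0.3, 0.3), so $\alpha = (1 - 0.7)/0.6 = 0.5$ and $p^*$ is approximately (0.6, 0.2, 0.2), which sums to one. Because $\alpha$ is the same for every class, $p^*$ sits at the same relative position inside each interval.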
Abstract
This paper presents an innovative approach, called the credal wrapper, to formulating a credal set representation of model averaging for Bayesian neural networks (BNNs) and deep ensembles (DEs), capable of improving uncertainty estimation in classification tasks. Given a finite collection of single predictive distributions derived from BNNs or DEs, the proposed credal wrapper extracts an upper and a lower probability bound per class, acknowledging the epistemic uncertainty that arises because only a limited number of distributions is available. Such probability intervals over classes can be mapped onto a convex set of probabilities (a credal set) from which, in turn, a unique prediction can be obtained using a transformation called the intersection probability transformation. In this article, we conduct extensive experiments on several out-of-distribution (OOD) detection benchmarks, encompassing various dataset pairs (CIFAR10/100 vs SVHN/Tiny-ImageNet, CIFAR10 vs CIFAR10-C, CIFAR100 vs CIFAR100-C and ImageNet vs ImageNet-O) and using different network architectures (such as VGG16, ResNet-18/50, EfficientNet B2, and ViT Base). Compared to the BNN and DE baselines, the proposed credal wrapper method exhibits superior performance in uncertainty estimation and achieves a lower expected calibration error on corrupted data.
