Random-Set Neural Networks (RS-NN)
Shireen Kudukkil Manchingal, Muhammad Mubashar, Kaizheng Wang, Keivan Shariatmadar, Fabio Cuzzolin
TL;DR
The paper tackles reliable uncertainty estimation in classification, especially under distribution shift and out-of-distribution (OoD) conditions. It introduces Random-Set Neural Networks (RS-NN), which use random-set theory to predict belief functions over a budgeted collection of class subsets, from which pignistic probabilities and credal-set-based uncertainty estimates are derived. A BCE-based loss with mass regularization enforces valid belief functions, and a budgeting procedure uses overlaps between class ellipsoids to select a compact, informative collection of focal sets. Empirically, RS-NN outperforms state-of-the-art Bayesian and ensemble methods on accuracy, OoD detection, and calibration, scales to large architectures, and supports conformal prediction guarantees. The approach offers a principled, scalable framework for uncertainty quantification in safety-critical AI applications.
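To make the terms above concrete, the following minimal sketch applies the standard belief-function definitions to a toy example. It is not the authors' implementation: the class names, focal sets, mass values, and the helper names `belief` and `pignistic` are all hypothetical, and the masses are written down by hand rather than predicted by a network.

```python
# Toy illustration of belief and pignistic probability (standard definitions).
# All names and numbers below are assumptions made for illustration only.

classes = ("cat", "dog", "fox")

# A mass function assigns non-negative weights, summing to 1, to sets of classes
# (the "focal sets"). Mass on a non-singleton set expresses indecision among its members.
mass = {
    frozenset({"cat"}): 0.5,
    frozenset({"dog"}): 0.2,
    frozenset({"cat", "dog"}): 0.2,
    frozenset({"cat", "dog", "fox"}): 0.1,
}


def belief(mass, subset):
    """Bel(A): total mass of the focal sets contained in A."""
    subset = frozenset(subset)
    return sum(m for focal, m in mass.items() if focal <= subset)


def pignistic(mass, classes):
    """BetP(c): each focal set shares its mass equally among its members."""
    return {
        c: sum(m / len(focal) for focal, m in mass.items() if c in focal)
        for c in classes
    }


bel_cat_dog = belief(mass, {"cat", "dog"})
betp = pignistic(mass, classes)
predicted_class = max(betp, key=betp.get)
```

Here the pignistic values form an ordinary probability vector, so a single class can still be predicted by taking the maximum, while the mass left on non-singleton sets records how undecided the prediction is.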
Abstract
Machine learning is increasingly deployed in safety-critical domains where erroneous predictions may have catastrophic consequences, highlighting the need for learning systems to be aware of how confident they are in their own predictions: in other words, 'to know when they do not know'. In this paper, we propose a novel Random-Set Neural Network (RS-NN) approach to classification which predicts belief functions (rather than classical probability vectors) over the class list using the mathematics of random sets, i.e., distributions over the collection of sets of classes. RS-NN encodes the 'epistemic' uncertainty induced by training sets that are insufficiently representative or limited in size via the size of the convex set of probability vectors associated with a predicted belief function. Our approach outperforms state-of-the-art Bayesian and ensemble methods in terms of accuracy, uncertainty estimation and out-of-distribution (OoD) detection on multiple benchmarks (CIFAR-10 vs SVHN/Intel-Image, MNIST vs FMNIST/KMNIST, ImageNet vs ImageNet-O). RS-NN also scales effectively to large architectures (e.g., WideResNet-28-10, VGG16, Inception V3, EfficientNetB2 and ViT-Base-16), exhibits remarkable robustness to adversarial attacks, and can provide statistical guarantees in a conformal learning setting.
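The idea of reading epistemic uncertainty off the size of the convex (credal) set can be sketched as follows. For a belief function, the probability of a single class c ranges over the interval [Bel({c}), Pl({c})] across the credal set, so the interval width is one simple proxy for its size. This is a sketch under assumptions: the function name `credal_bounds`, the two hand-written mass functions, and the use of interval width as the size measure are illustrative choices and are not taken from the abstract.

```python
# Per-class lower and upper probabilities induced by a mass function.
# The mass functions below are hypothetical examples, not model outputs.


def credal_bounds(mass, classes):
    """Return {class: (lower, upper)} with lower = Bel({c}) and upper = Pl({c})."""
    bounds = {}
    for c in classes:
        # Bel({c}): only the singleton focal set {c} is contained in {c}.
        lower = sum(m for focal, m in mass.items() if focal == frozenset({c}))
        # Pl({c}): every focal set that contains c is compatible with c.
        upper = sum(m for focal, m in mass.items() if c in focal)
        bounds[c] = (lower, upper)
    return bounds


classes = ("cat", "dog", "fox")
everything = frozenset(classes)

# Most mass on a single class: narrow interval, low epistemic uncertainty.
confident = {frozenset({"cat"}): 0.9, everything: 0.1}

# Much mass on larger sets: wide interval, high epistemic uncertainty.
vague = {
    frozenset({"cat"}): 0.3,
    frozenset({"cat", "dog"}): 0.3,
    everything: 0.4,
}

low_c, up_c = credal_bounds(confident, classes)["cat"]
low_v, up_v = credal_bounds(vague, classes)["cat"]
width_confident = up_c - low_c
width_vague = up_v - low_v
```

Both examples give "cat" the same upper probability, but the second leaves far more mass uncommitted, which shows up as a wider interval.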
