Random-Set Neural Networks (RS-NN)

Shireen Kudukkil Manchingal; Muhammad Mubashar; Kaizheng Wang; Keivan Shariatmadar; Fabio Cuzzolin

TL;DR

The paper tackles reliable uncertainty estimation in classification, especially under distribution shift and out-of-distribution (OoD) conditions. It introduces Random-Set Neural Networks (RS-NN), which use random-set theory to predict belief functions over a budgeted collection of class subsets, yielding pignistic probabilities and credal-set-based uncertainty estimates. A BCE-based loss with mass regularization enforces valid belief functions, and a budgeting procedure selects a compact, informative collection of focal sets using class-ellipsoid overlaps. Empirically, RS-NN outperforms state-of-the-art Bayesian and ensemble methods on accuracy, OoD detection, and calibration, while scaling to large architectures and supporting statistical guarantees in a conformal learning setting. The approach offers a principled, scalable framework for uncertainty quantification in safety-critical AI applications.
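As a rough illustration of the quantities named above, the sketch below computes a belief value and the pignistic probabilities from a mass function over class subsets. It uses the standard definitions: the belief of a set is the total mass of the focal sets it contains, and the pignistic probability of a class splits each focal set's mass evenly among its members. The class names, mass values, and function names are hypothetical and are not taken from the paper or its code.

```python
# Hypothetical mass function over focal sets (subsets of classes).
# Values are illustrative only; they are non-negative and sum to 1.
mass = {
    frozenset({"cat"}): 0.5,
    frozenset({"dog"}): 0.2,
    frozenset({"cat", "dog"}): 0.2,
    frozenset({"cat", "dog", "bird"}): 0.1,
}


def belief(mass, subset):
    """Bel(A): total mass of all focal sets contained in A."""
    subset = frozenset(subset)
    return sum(m for focal, m in mass.items() if focal <= subset)


def pignistic(mass):
    """BetP(c): each focal set's mass shared equally among its classes."""
    classes = sorted(set().union(*mass))
    return {
        c: sum(m / len(focal) for focal, m in mass.items() if c in focal)
        for c in classes
    }


if __name__ == "__main__":
    print("Bel({cat, dog}) =", belief(mass, {"cat", "dog"}))
    print("BetP =", pignistic(mass))
```

The pignistic values form an ordinary probability vector, so they can be used wherever a softmax output would be, for example to pick the predicted class.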

Abstract

Machine learning is increasingly deployed in safety-critical domains where erroneous predictions may lead to potentially catastrophic consequences, highlighting the need for learning systems to be aware of how confident they are in their own predictions: in other words, 'to know when they do not know'. In this paper, we propose a novel Random-Set Neural Network (RS-NN) approach to classification which predicts belief functions (rather than classical probability vectors) over the class list using the mathematics of random sets, i.e., distributions over the collection of sets of classes. RS-NN encodes the 'epistemic' uncertainty induced by training sets that are insufficiently representative or limited in size via the size of the convex set of probability vectors associated with a predicted belief function. Our approach outperforms state-of-the-art Bayesian and Ensemble methods in terms of accuracy, uncertainty estimation and out-of-distribution (OoD) detection on multiple benchmarks (CIFAR-10 vs SVHN/Intel-Image, MNIST vs FMNIST/KMNIST, ImageNet vs ImageNet-O). RS-NN also scales up effectively to large-scale architectures (e.g. WideResNet-28-10, VGG16, Inception V3, EfficientNetB2 and ViT-Base-16), exhibits remarkable robustness to adversarial attacks and can provide statistical guarantees in a conformal learning setting.
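To make the idea of "size of the convex set of probability vectors" concrete, the sketch below computes, for one class, the lower and upper probabilities implied by a mass function (belief and plausibility of the singleton) and takes their difference as a width. This is one simple proxy for credal-set size, assumed here for illustration; the paper's exact measure may differ. All names and mass values are hypothetical.

```python
def lower_upper(mass, cls):
    """Lower and upper probability of a single class under a mass function.

    lower = Bel({cls}): mass assigned to the singleton itself.
    upper = Pl({cls}): total mass of every focal set containing cls.
    """
    lower = mass.get(frozenset({cls}), 0.0)
    upper = sum(m for focal, m in mass.items() if cls in focal)
    return lower, upper


def credal_width(mass, cls):
    """Width of the probability interval for cls (assumed uncertainty proxy)."""
    lower, upper = lower_upper(mass, cls)
    return upper - lower


# Hypothetical prediction where most mass sits on a single class.
confident = {
    frozenset({"cat"}): 0.9,
    frozenset({"cat", "dog"}): 0.1,
}

# Hypothetical prediction where most mass sits on the full class set.
uncertain = {
    frozenset({"cat"}): 0.2,
    frozenset({"cat", "dog", "bird"}): 0.8,
}

if __name__ == "__main__":
    print("width (confident):", credal_width(confident, "cat"))
    print("width (uncertain):", credal_width(uncertain, "cat"))
```

Mass placed on larger sets of classes widens the interval, which is how a prediction of this kind can signal that the model "does not know".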

Paper Structure (38 sections, 17 equations, 28 figures, 19 tables, 3 algorithms)

Introduction
Random sets and belief functions
Random-Set Neural Network
Approach
Loss function
Accuracy and uncertainty estimation
Experiments
Implementation
Comparison with the state-of-the-art on accuracy
Out-of-distribution (OoD) detection
Uncertainty estimation
Scalability to large-scale architectures
Limitations
Conclusion
Theoretical explanation of RS-NN
...and 23 more sections

Figures (28)

Figure 1: Inference in a Bayesian Neural Network (top) as opposed to a Random-Set Neural Network (bottom), with corresponding measures of uncertainty and their sources. The triangle represents the set of probability vectors (probability simplex) one can define on the target space (e.g., a set of 3 classes).
Figure 2: Confidence scores of RS-NN and CNN under FGSM adversarial attacks with different perturbation sizes ($\epsilon$ = 0.05, 0.01) on the MNIST dataset.
Figure 3: The random set associated with a cloaked die in which faces 1 and 2 are not visible.
Figure 4: A belief function is equivalent to a credal set with boundaries determined by lower bounds (Eq. \ref{eq:consistent}) on probability values.
Figure 6: Credal set width for RS-NN on iD vs OoD datasets: CIFAR10 vs SVHN/Intel Image, MNIST vs F-MNIST/K-MNIST and ImageNet vs ImageNet-O.
...and 23 more figures
