Be Bayesian by Attachments to Catch More Uncertainty
Shiyu Shen, Bin Pan, Tianyang Shi, Tao Li, Zhenwei Shi
TL;DR
ABNN, a Bayesian neural network with an attached structure, tackles uncertainty estimation by extending Bayesian neural networks with an attachment that captures OOD uncertainty while preserving ID predictive power. The authors formalize ID, semi-OOD, and full-OOD data partitions, connect uncertainty to posterior variance, and provide a convergence analysis. Training alternates between ID-focused KL minimization and OOD-oriented KL maximization, which inflates the variance on OOD data through a lightweight attachment while maintaining backbone performance. Empirical results on MNIST, SVHN, and CIFAR show strong OOD detection and misclassification detection, robustness across backbones, and insensitivity to the exact choice of the balancing parameter, which makes ABNN practical for reliable uncertainty estimation in real-world systems.
Abstract
Bayesian Neural Networks (BNNs) have become one of the promising approaches for uncertainty estimation owing to their solid theoretical foundations. However, the performance of BNNs depends on their ability to catch uncertainty. Instead of only seeking the distribution of neural network weights from in-distribution (ID) data, in this paper we propose a new Bayesian Neural Network with an Attached structure (ABNN) to catch more uncertainty from out-of-distribution (OOD) data. We first construct a mathematical description of the uncertainty of OOD data according to the prior distribution, and then develop an attached Bayesian structure to integrate the uncertainty of OOD data into the backbone network. ABNN is composed of an expectation module and several distribution modules. The expectation module is a backbone deep network that focuses on the original task, and the distribution modules are mini Bayesian structures that serve as attachments to the backbone. In particular, the distribution modules aim at extracting the uncertainty from both ID and OOD data. We further provide a theoretical analysis of the convergence of ABNN, and experimentally validate its superiority by comparing it with several state-of-the-art uncertainty estimation methods. Code will be made available.
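To make the split between the expectation module and a distribution module concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single linear layer as the backbone, one Gaussian-weight linear layer as the attachment, an additive combination of their logits, and the summed per-class variance of Monte Carlo softmax outputs as the uncertainty score. All function names, shapes, and the combination rule are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone(x, W):
    """Deterministic 'expectation module' (assumed: one linear layer)."""
    return x @ W

def attachment_samples(x, mu, log_sigma, n_samples, rng):
    """Mini Bayesian 'distribution module' (assumed: Gaussian-weight linear layer).

    Returns logits of shape (n_samples, batch, classes).
    """
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_samples,) + mu.shape)
    weights = mu + sigma * eps  # reparameterised weight samples
    return np.einsum("bd,sdc->sbc", x, weights)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_with_uncertainty(x, W, mu, log_sigma, n_samples=200, rng=rng):
    """Combine backbone and attachment (assumed: additive logits).

    Returns the mean class probabilities and a scalar uncertainty per input,
    computed as the predictive variance summed over classes.
    """
    logits = backbone(x, W)[None] + attachment_samples(x, mu, log_sigma, n_samples, rng)
    probs = softmax(logits)
    mean_probs = probs.mean(axis=0)
    variance = probs.var(axis=0).sum(axis=-1)
    return mean_probs, variance

# Toy inputs: 4 examples, 8 features, 3 classes.
x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))
mu = np.zeros((8, 3))

# A narrow weight distribution yields low variance; a wide one yields high variance.
_, var_narrow = predict_with_uncertainty(x, W, mu, np.full((8, 3), -10.0))
mean_probs, var_wide = predict_with_uncertainty(x, W, mu, np.zeros((8, 3)))
```

In this sketch a wider attachment weight distribution produces a larger predictive variance, which is the quantity a variance-based OOD detector would threshold. How ABNN actually trains the attachment so that the variance grows on OOD inputs is described in the paper and is not reproduced here.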
