Logit Disagreement: OoD Detection with Bayesian Neural Networks
Kevin Raina
TL;DR
This paper tackles out-of-distribution (OoD) detection for Bayesian neural networks by disentangling epistemic uncertainty from aleatoric noise. It introduces logit-based disagreement scores as a simple, post-hoc proxy for epistemic uncertainty, including a disagreement score, weight entropy, and standard deviation of log-logits, with a logit proxy that uses truncated maximum logits. Across MNIST-family and CIFAR-10 experiments, these logit-based scores match or outperform mutual information and perform on par with predictive entropy in OoD detection, while remaining easy to apply after training without additional training or architectural changes. The work demonstrates that pre-softmax logits can yield strong uncertainty signals and encourages further exploration of logits and other posterior-inference methods for robust OoD detection in safety-critical settings.
Abstract
Bayesian neural networks (BNNs), which estimate the full posterior distribution over model parameters, are well known for their role in uncertainty quantification and its promising application to out-of-distribution (OoD) detection. Among other uncertainty measures, BNNs provide a state-of-the-art estimate of predictive entropy (total uncertainty), which can be decomposed as the sum of mutual information and expected entropy. In the context of OoD detection, estimating predictive uncertainty with the predictive entropy score confounds aleatoric and epistemic uncertainty, the latter being hypothesized to be high for OoD points. Mutual information, which captures the epistemic component, would therefore seem the natural OoD score, yet it has been shown to perform worse than predictive entropy. Taking inspiration from the Bayesian variational autoencoder (BVAE) literature, this work proposes to measure the disagreement between corrected versions of the pre-softmax quantities, otherwise known as logits, as an estimate of epistemic uncertainty for BNNs under mean-field variational inference. The three proposed epistemic uncertainty scores show marked improvements over mutual information on a range of OoD experiments and equal performance on the rest. Moreover, the epistemic uncertainty scores perform on par with the Bayesian benchmark, predictive entropy, on a range of MNIST and CIFAR10 experiments.
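The decomposition of predictive entropy into mutual information plus expected entropy can be illustrated with a minimal NumPy sketch. It assumes the logits from S posterior weight samples are already available as an array of shape (S, N, C); the function names and the random stand-in logits are hypothetical and are not taken from the paper. The sketch covers only the standard baseline scores, not the proposed logit-disagreement scores.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, axis=-1, eps=1e-12):
    # Shannon entropy in nats; eps guards against log(0).
    return -(p * np.log(p + eps)).sum(axis=axis)

def uncertainty_decomposition(logits):
    """logits: array of shape (S, N, C) for S posterior samples,
    N inputs, and C classes (an assumed layout)."""
    probs = softmax(logits)                         # (S, N, C)
    mean_probs = probs.mean(axis=0)                 # (N, C)
    predictive_entropy = entropy(mean_probs)        # total uncertainty
    expected_entropy = entropy(probs).mean(axis=0)  # aleatoric part
    mutual_information = predictive_entropy - expected_entropy  # epistemic part
    return predictive_entropy, expected_entropy, mutual_information

# Random logits stand in for the outputs of sampled BNN weights.
rng = np.random.default_rng(0)
sample_logits = rng.normal(size=(20, 5, 10))
pe, ee, mi = uncertainty_decomposition(sample_logits)
```

Each returned array holds one score per input. For OoD detection, a score such as `pe` or `mi` would be thresholded or ranked, with higher values flagging inputs as more likely OoD.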
