A Rate-Distortion View of Uncertainty Quantification
Ifigeneia Apostolopoulou, Benjamin Eysenbach, Frank Nielsen, Artur Dubrawski
TL;DR
The paper addresses reliable uncertainty quantification for deep neural networks by making predictions distance-aware relative to the training data manifold. It introduces Distance Aware Bottleneck (DAB), a single-model, deterministic approach that learns a codebook of encoder distributions and uses the distance of a new input's encoding from that codebook to quantify uncertainty within a rate-distortion formulation of the information bottleneck (IB). By replacing the IB complexity term with a finite-codebook rate-distortion objective and training with alternating minimization, DAB improves out-of-distribution detection and misclassification calibration, often surpassing ensembles at far lower computational cost. The method supports post-hoc deployment on pre-trained feature extractors and reports strong results on synthetic tasks, CIFAR-10, and ImageNet-1K, which makes it practical for scalable, calibrated uncertainty estimation in real-world applications.
Abstract
In supervised learning, understanding an input's proximity to the training data can help a model decide whether it has sufficient evidence for reaching a reliable prediction. While powerful probabilistic models such as Gaussian Processes naturally have this property, deep neural networks often lack it. In this paper, we introduce Distance Aware Bottleneck (DAB), a new method for enriching deep neural networks with this property. Building on prior information bottleneck approaches, our method learns a codebook that stores a compressed representation of all inputs seen during training. The distance of a new example from this codebook can serve as an uncertainty estimate for the example. The resulting model is simple to train and provides deterministic uncertainty estimates in a single forward pass. Finally, our method achieves better out-of-distribution (OOD) detection and misclassification prediction than prior methods, including expensive ensemble methods, deep kernel Gaussian Processes, and approaches based on the standard information bottleneck.
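To make the idea of "distance from the codebook" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the encoder outputs a diagonal Gaussian per input, that each codebook entry is also a diagonal Gaussian, and that the distance is a KL divergence reduced by taking the minimum over entries. The function names `kl_diag_gaussians` and `codebook_uncertainty`, the toy codebook, and the min-reduction are all illustrative assumptions; the paper's actual distance and aggregation may differ.

```python
import numpy as np


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between diagonal Gaussians.

    Broadcasts over leading axes and sums over the last (feature) axis.
    """
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0,
        axis=-1,
    )


def codebook_uncertainty(mu, var, code_mu, code_var):
    """Hypothetical score: KL from one encoding to its nearest codebook entry.

    mu, var:            shape (d,), the encoder distribution for one input.
    code_mu, code_var:  shape (k, d), the k codebook distributions.
    Taking the minimum is an assumption made for illustration only.
    """
    distances = kl_diag_gaussians(mu, var, code_mu, code_var)  # shape (k,)
    return float(distances.min())


# Toy codebook with two unit-variance entries (made-up values).
code_mu = np.array([[0.0, 0.0], [5.0, 5.0]])
code_var = np.ones((2, 2))

# An encoding close to the first entry versus one far from both entries.
near = codebook_uncertainty(np.array([0.1, -0.1]), np.ones(2), code_mu, code_var)
far = codebook_uncertainty(np.array([20.0, -20.0]), np.ones(2), code_mu, code_var)
print(near, far)
```

In this toy setup the encoding near a codebook entry receives a much smaller score than the distant one, which is the qualitative behavior the abstract describes: a single deterministic forward pass through the encoder, followed by a cheap comparison against the stored codebook.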
