Transitional Uncertainty with Layered Intermediate Predictions
Ryan Benkert, Mohit Prabhushankar, Ghassan AlRegib
TL;DR
This work tackles the challenge of reliable single-pass uncertainty estimation by examining why output-distance preservation can hinder the learning objective. It introduces Transitional Uncertainty with Layered Intermediate Predictions (TULIP), which employs transitional feature preservation across intermediate representations and a combination head to produce uncertainty scores in one forward pass. The authors provide theoretical insight into when distance preservation helps and demonstrate a practical, scalable method that matches or surpasses existing single-pass estimators on CIFAR, medical CT data, and ImageNet, especially under architectural complexity and class imbalance. The approach offers a pragmatic alternative to ensembles, improving uncertainty estimates without the computational burden of multiple forward passes, and shows promise for deployment in real-world scenarios with diverse data modalities.
Abstract
In this paper, we discuss feature engineering for single-pass uncertainty estimation. For accurate uncertainty estimates, neural networks must extract differences in the feature space that quantify uncertainty. Current single-pass approaches attempt to achieve this by maintaining feature distances between data points as they traverse the network. While initial results are promising, maintaining feature distances within the network representations frequently inhibits information compression and opposes the learning objective. We study this effect theoretically and empirically to arrive at a simple conclusion: preserving feature distances in the output is beneficial when the preserved features contribute to learning the label distribution and acts in opposition to it otherwise. We then propose Transitional Uncertainty with Layered Intermediate Predictions (TULIP) as a simple approach to address the shortcomings of current single-pass estimators. Specifically, we implement feature preservation by extracting features from intermediate representations before information is collapsed by subsequent layers. We refer to the underlying preservation mechanism as transitional feature preservation. We show that TULIP matches or outperforms current single-pass methods on standard benchmarks and in practical settings where these methods are less reliable (class imbalance, complex architectures, medical modalities).
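The general idea of reading predictions off intermediate representations and combining them into one score in a single forward pass can be sketched as follows. This is an illustrative toy and not the authors' implementation: the class name `LayeredPredictor`, the randomly initialized linear heads, and the combination rule (mean entropy over the intermediate predictions) are all assumptions made for this example, since the abstract does not specify the architecture of the heads or of the combination step.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def linear(weights, x):
    """Matrix-vector product; each row of `weights` has len(x) entries."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rng, rows, cols):
    """Gaussian-initialized matrix, scaled by 1/sqrt(cols)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LayeredPredictor:
    """Toy network with one (hypothetical) prediction head per hidden layer."""

    def __init__(self, dims, num_classes, seed=0):
        rng = random.Random(seed)
        n_layers = len(dims) - 1
        # Backbone layers: dims[i] -> dims[i + 1].
        self.layers = [random_matrix(rng, dims[i + 1], dims[i])
                       for i in range(n_layers)]
        # One head per layer, reading that layer's features directly,
        # before later layers can compress them away.
        self.heads = [random_matrix(rng, num_classes, dims[i + 1])
                      for i in range(n_layers)]

    def forward(self, x):
        """Single forward pass; returns the class distribution of every head."""
        intermediate_probs = []
        h = x
        for layer, head in zip(self.layers, self.heads):
            h = [max(0.0, z) for z in linear(layer, h)]  # ReLU activation
            intermediate_probs.append(softmax(linear(head, h)))
        return intermediate_probs

    def predict_with_uncertainty(self, x):
        """Return the last head's prediction and a combined uncertainty score."""
        probs = self.forward(x)
        # Assumed combination rule (not from the paper): mean entropy
        # across all intermediate predictions.
        score = sum(entropy(p) for p in probs) / len(probs)
        return probs[-1], score


model = LayeredPredictor(dims=[4, 8, 6], num_classes=3, seed=0)
prediction, uncertainty = model.predict_with_uncertainty([0.5, -1.0, 2.0, 0.1])
```

In this sketch the uncertainty score lies between 0 and log(3), the maximum entropy for three classes. A trained system would learn both the intermediate heads and the combination step rather than use random weights and a fixed average.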
