Efficient Post-Hoc Uncertainty Calibration via Variance-Based Smoothing
Fabian Denoodt, José Oramas
TL;DR
This work targets reliable, real-time uncertainty estimation for pre-trained classifiers, where traditional methods such as deep ensembles and MC-dropout are computationally expensive. It introduces Variance-based Smoothing, which uses the variance of predictions across informative sub-patches of the input to calibrate confidence by scaling the logits with $\tilde{\sigma}$. When an ensemble of $M$ models is available, the same mechanism extends to it via $p(y\mid \mathbf{x}) = \text{Softmax}\left(\frac{ \frac{1}{M}\sum_m p_{\text{logit}~\theta_m}(y\mid \mathbf{x})}{\tilde{\sigma}}\right)$. The method preserves classification accuracy while improving calibration on clean data and under various dataset shifts, at minimal computational overhead compared to ensembles or MC-dropout. Experiments on Radio Signals, LibriSpeech, and CIFAR-10 show competitive or superior calibration and informative uncertainty distributions, especially in settings with many classes, where traditional ensembles struggle. Overall, Variance-based Smoothing offers a practical post-hoc calibration tool suitable for real-time deployment, and it can make ensemble predictions more expressive without significant extra cost.
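To make the ensemble formula above concrete, here is a minimal sketch in plain Python. It assumes that $\tilde{\sigma}$ is already available as a positive scalar; how it is obtained from the sub-patch variance is not specified in this summary, so it is passed in by the caller. The function names and the example logits are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_ensemble_predict(member_logits, sigma_tilde):
    """Compute Softmax(mean over members of the logits / sigma_tilde).

    member_logits: list of M logit vectors, one per ensemble member.
    sigma_tilde:   positive scalar (assumed given; its derivation from
                   the sub-patch variance is not covered by this sketch).
    """
    if sigma_tilde <= 0:
        raise ValueError("sigma_tilde must be positive")
    num_members = len(member_logits)
    num_classes = len(member_logits[0])
    # Average the logits class by class across the M members.
    mean_logits = [
        sum(member[c] for member in member_logits) / num_members
        for c in range(num_classes)
    ]
    # Divide by sigma_tilde before the softmax, as in the formula.
    return softmax([z / sigma_tilde for z in mean_logits])


# Illustrative logits for M = 2 members and 3 classes (not from the paper).
logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
sharp = smoothed_ensemble_predict(logits, sigma_tilde=1.0)
smooth = smoothed_ensemble_predict(logits, sigma_tilde=4.0)
```

In this sketch a larger `sigma_tilde` flattens the predicted distribution, while the predicted class is unchanged because dividing all logits by the same positive scalar does not alter their ordering. With a single member (M = 1) the function reduces to scaling one model's logits.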
Abstract
Since state-of-the-art uncertainty estimation methods are often computationally demanding, we investigate whether incorporating prior information can improve uncertainty estimates in conventional deep neural networks. Our focus is on machine learning tasks where meaningful predictions can be made from sub-parts of the input. For example, in speaker classification, the speech waveform can be divided into sequential patches, each containing information about the same speaker. We observe that the variance between sub-predictions serves as a reliable proxy for uncertainty in such settings. Our proposed variance-based scaling framework produces competitive uncertainty estimates in classification while being less computationally demanding and allowing for integration as a post-hoc calibration tool. This approach also leads to a simple extension of deep ensembles, improving the expressiveness of their predicted distributions.
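As a rough illustration of the observation that the variance between sub-predictions can act as an uncertainty proxy, the sketch below computes one variance score from per-patch class probabilities. The aggregation used here (population variance per class, averaged over classes) is an assumption made for the example, not necessarily the paper's definition, and the function name and patch probabilities are hypothetical.

```python
import statistics


def patch_variance(patch_probs):
    """Mean over classes of the variance across patches.

    patch_probs: list of per-patch probability vectors, all of equal length.
    The aggregation rule is an illustrative assumption.
    """
    num_classes = len(patch_probs[0])
    per_class = [
        statistics.pvariance([probs[c] for probs in patch_probs])
        for c in range(num_classes)
    ]
    return sum(per_class) / num_classes


# Illustrative sub-predictions for three patches of one input (not real data).
consistent = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05], [0.92, 0.04, 0.04]]
conflicting = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10], [0.30, 0.20, 0.50]]

low_score = patch_variance(consistent)
high_score = patch_variance(conflicting)
```

Patches that agree on the class yield a score near zero, while patches that disagree yield a larger score, which is the signal a variance-based scaling scheme would draw on.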
