Uncertainty Calibration with Energy Based Instance-wise Scaling in the Wild Dataset
Mijoo Kim, Junseok Kwon
TL;DR
This work tackles uncertainty calibration for deep neural networks under distribution shift, covering both in-distribution (ID) and out-of-distribution (OOD) data. It introduces an energy-based, instance-wise post-hoc calibration method that uses a per-input scaling factor derived from energy scores computed from the network logits, with trained parameters that adapt calibration to each sample. By leveraging the free energy and the differing per-sample energy distributions of correct and incorrect predictions, the approach achieves robust calibration across ID, covariate-shift, and semantic-shift settings, and demonstrates superior or competitive performance against established baselines and DAC. The method yields more reliable uncertainty estimates for safety-critical applications and offers a practical, lightweight post-hoc calibration option that generalizes across architectures and datasets. Overall, the paper presents an energy-based framework that improves the trustworthiness of DNN predictions in the wild, validated empirically on multiple benchmarks.
Abstract
With the rapid advancement in the performance of deep neural networks (DNNs), there has been significant interest in deploying and incorporating artificial intelligence (AI) systems into real-world scenarios. However, many DNNs lack the ability to represent uncertainty, often exhibiting excessive confidence even when making incorrect predictions. To ensure the reliability of AI systems, particularly in safety-critical cases, DNNs should transparently reflect the uncertainty in their predictions. In this paper, we investigate robust post-hoc uncertainty calibration methods for DNNs in multi-class classification tasks. While previous studies have made notable progress, they still struggle to achieve robust calibration, particularly in scenarios involving out-of-distribution (OOD) data. We identify that previous methods lack adaptability to individual inputs and struggle to accurately estimate uncertainty when processing inputs drawn from in-the-wild data. To address this issue, we introduce a novel instance-wise calibration method based on an energy model. Our method uses energy scores instead of softmax confidence scores, allowing the uncertainty of the DNN to be considered adaptively for each prediction in logit space. In experiments, we show that the proposed method consistently maintains robust performance across the spectrum, from in-distribution to OOD scenarios, when compared with other state-of-the-art methods.
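To make the idea of energy-based, per-input scaling concrete, the following is a minimal sketch rather than the authors' implementation. It computes the standard free energy of a logit vector, E(x) = -T log sum_k exp(f_k(x)/T), and then maps it to a per-sample temperature. The mapping itself (a softplus of the energy relative to a reference value) and the names `instance_wise_calibrate`, `base_temperature`, `weight`, and `energy_ref` are assumptions chosen for illustration; the abstract does not specify the exact scaling function, and in practice such parameters would be fit on held-out data.

```python
import numpy as np

def free_energy(logits, temperature=1.0):
    """Free energy E(x) = -T * log(sum_k exp(f_k(x) / T)), computed stably."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def instance_wise_calibrate(logits, base_temperature=1.5, weight=0.1, energy_ref=-5.0):
    """Hypothetical per-sample scaling: T(x) = base + weight * softplus(E(x) - energy_ref).

    Higher (less negative) energy is treated as a sign of higher uncertainty,
    so it yields a larger temperature and a flatter predictive distribution.
    This rule is an illustrative assumption, not the rule from the paper.
    """
    logits = np.asarray(logits, dtype=float)
    energy = free_energy(logits)                       # shape: (batch,)
    softplus = np.logaddexp(0.0, energy - energy_ref)  # log(1 + exp(.))
    per_sample_t = base_temperature + weight * softplus
    return softmax(logits / per_sample_t[:, None])

# Two example inputs: one confident logit vector, one ambiguous one.
example_logits = np.array([[10.0, 0.0, 0.0], [1.0, 0.5, 0.0]])
calibrated = instance_wise_calibrate(example_logits)
```

Because each row is divided by a single positive scalar, the predicted class is unchanged and only the confidence is adjusted, which is the defining property of post-hoc temperature-style calibration.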
