Revisiting Confidence Estimation: Towards Reliable Failure Prediction
Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu
TL;DR
This work questions the assumption that confidence calibration and OOD detection naturally improve failure prediction in neural classifiers. It shows that many calibration and OOD detection methods weaken the separation between correct and misclassified predictions, reframes failure prediction as a Bayes-like decision problem, and links it to the flatness of the loss landscape. The authors introduce FMFP, a simple, plug-and-play approach that combines stochastic weight averaging (SWA) and sharpness-aware minimization (SAM) to find flat minima. The approach is supported by PAC-Bayes theory and by strong empirical gains across balanced, long-tailed, and covariate-shift regimes, as well as by improved OOD detection. The results establish a robust, unified baseline for reliable confidence estimation and clarify the connections between calibration, OOD detection, and failure prediction, with practical implications for safety-critical AI.
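To make the two ingredients concrete, the following is a minimal sketch of a sharpness-aware minimization (SAM) update followed by stochastic weight averaging (SWA) on a toy quadratic loss. It is not the authors' FMFP implementation: the loss, the function names (`toy_loss`, `sam_step`, `train_sam_swa`), and the hyperparameters (`lr`, `rho`, `swa_start`) are assumptions chosen only for illustration, and the exact way FMFP schedules the two techniques is not specified in this summary.

```python
import math

def toy_loss(w):
    # Hypothetical toy loss L(w) = 0.5 * (4*w0^2 + w1^2); stands in for a network loss.
    return 0.5 * (4.0 * w[0] ** 2 + w[1] ** 2)

def toy_grad(w):
    # Analytic gradient of toy_loss.
    return [4.0 * w[0], w[1]]

def sam_step(w, lr=0.1, rho=0.05):
    # SAM: move to the approximate worst-case point within an L2 ball of radius rho,
    # then update the original weights with the gradient taken at that point.
    g = toy_grad(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(w)  # already at a stationary point
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = toy_grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

def train_sam_swa(w_init, steps=50, swa_start=25):
    # SWA: keep a running mean of the weights visited after step swa_start.
    w = list(w_init)
    swa_w, n_avg = None, 0
    for t in range(steps):
        w = sam_step(w)
        if t >= swa_start:
            n_avg += 1
            if swa_w is None:
                swa_w = list(w)
            else:
                swa_w = [s + (wi - s) / n_avg for s, wi in zip(swa_w, w)]
    return w, swa_w

w_start = [2.0, -3.0]
w_last, w_swa = train_sam_swa(w_start)
print("loss at start:", toy_loss(w_start))
print("loss at SWA weights:", toy_loss(w_swa))
```

In a real training loop the SAM step costs two forward-backward passes per batch, and the averaged weights typically require recomputing batch-normalization statistics before evaluation; neither detail is modeled in this toy.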
Abstract
Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident in their incorrect predictions, i.e., on misclassified samples from known classes and on out-of-distribution (OOD) samples from unknown classes. In recent years, many confidence calibration and OOD detection methods have been developed. In this paper, we identify a general, widespread, yet largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors. We investigate this problem and reveal that popular calibration and OOD detection methods often lead to worse confidence separation between correctly classified and misclassified examples, making it difficult to decide whether to trust a prediction. Finally, we propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance under various settings, including balanced, long-tailed, and covariate-shift classification scenarios. Our study not only provides a strong baseline for reliable confidence estimation but also acts as a bridge between understanding calibration, OOD detection, and failure prediction. The code is available at https://github.com/Impression2805/FMFP.
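The notion of "confidence separation" can be illustrated with a small sketch. It assumes, as is common but not stated in the abstract, that confidence is the maximum softmax probability and that separation is scored by AUROC, i.e., the probability that a randomly chosen correct prediction receives higher confidence than a randomly chosen misclassified one. The logits, labels, and function names below are hypothetical toy values, not results from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def failure_prediction_auroc(confidences, is_correct):
    # Fraction of (correct, misclassified) pairs in which the correct prediction
    # has the higher confidence; ties count as half.
    pos = [c for c, ok in zip(confidences, is_correct) if ok]
    neg = [c for c, ok in zip(confidences, is_correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one misclassified sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data: four samples, three classes. The last sample is an overconfident error.
logits = [[4.0, 0.5, 0.1], [0.2, 3.0, 0.3], [1.0, 1.2, 0.9], [3.5, 0.1, 0.2]]
labels = [0, 1, 0, 2]

probs = [softmax(z) for z in logits]
preds = [p.index(max(p)) for p in probs]
confidences = [max(p) for p in probs]
is_correct = [pred == y for pred, y in zip(preds, labels)]

auroc = failure_prediction_auroc(confidences, is_correct)
print(auroc)  # 0.75 on this toy data
```

An AUROC of 1.0 would mean every correct prediction is more confident than every error; the overconfident misclassification in the last sample is what pulls the score down here.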
