Calibrating Bayesian Learning via Regularization, Confidence Minimization, and Selective Inference

Jiayi Huang, Sangwoo Park, Osvaldo Simeone

TL;DR

Numerical results illustrate the trade-offs between ID accuracy, ID calibration, and OOD calibration, showing that the proposed novel Bayesian approach achieves the best ID and OOD performance compared to existing state-of-the-art approaches, at the cost of rejecting a fraction of the inputs.

Abstract

The application of artificial intelligence (AI) models in fields such as engineering is limited by the known difficulty of quantifying the reliability of an AI's decisions. A well-calibrated AI model must correctly report its accuracy on in-distribution (ID) inputs, while also enabling the detection of out-of-distribution (OOD) inputs. A conventional approach to improving calibration is Bayesian ensembling. However, owing to computational limitations and model misspecification, practical ensembling strategies do not necessarily enhance calibration. This paper proposes an extension of variational inference (VI)-based Bayesian learning that integrates calibration regularization for improved ID performance, confidence minimization for OOD detection, and selective calibration to ensure a synergistic use of calibration regularization and confidence minimization. The scheme is constructed in three successive steps: first introducing calibration-regularized Bayesian learning (CBNN), then incorporating out-of-distribution confidence minimization (OCM) to yield CBNN-OCM, and finally also integrating selective calibration to produce selective CBNN-OCM (SCBNN-OCM). Selective calibration rejects inputs for which the calibration performance is expected to be insufficient. Numerical results illustrate the trade-offs between ID accuracy, ID calibration, and OOD calibration attained by both frequentist and Bayesian learning methods. Among the main conclusions, SCBNN-OCM is seen to achieve the best ID and OOD performance compared to existing state-of-the-art approaches, at the cost of rejecting a sufficiently large number of inputs.
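
To make the Bayesian ensembling mentioned in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that logits have already been computed for several parameter vectors sampled from a variational posterior; the function names `softmax` and `ensemble_predict` and the random toy logits are hypothetical and used only for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def ensemble_predict(logits_per_sample):
    """Average class probabilities over sampled parameter vectors.

    logits_per_sample: array of shape (S, N, C) holding the logits of
    S sampled models for N inputs and C classes (assumed to be given).
    Returns the averaged probabilities (N, C), the hard decisions (N,),
    and the confidence assigned to each decision (N,).
    """
    probs = softmax(logits_per_sample, axis=-1).mean(axis=0)
    y_hat = probs.argmax(axis=-1)   # predicted class per input
    conf = probs.max(axis=-1)       # probability of the predicted class
    return probs, y_hat, conf

# Toy example: 5 sampled models, 4 inputs, 3 classes (random logits).
rng = np.random.default_rng(0)
toy_logits = rng.normal(size=(5, 4, 3))
probs, y_hat, conf = ensemble_predict(toy_logits)
```

The confidence returned here is simply the ensemble probability of the predicted class; whether that confidence matches the actual accuracy is exactly the calibration question the paper studies.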

Paper Structure (42 sections, 45 equations, 16 figures, 2 tables)

Figures (16)

  • Figure 1: (a) Standard frequentist neural networks (FNNs) generally fail to provide well-calibrated decisions, and improved in-distribution (ID) calibration can be achieved via Bayesian neural networks (BNNs) huang2023calibration. (b) Calibration regularization improves ID calibration via a regularizer that penalizes calibration errors kumar2018trainable. (c) Out-of-distribution confidence minimization (OCM) injects OOD examples during training to improve OOD detection choi2023conservative. (d) Selective calibration further improves both ID and OOD calibration by only producing decisions for inputs at which uncertainty quantification is deemed to be sufficiently reliable. Prior works kumar2018trainable, choi2023conservative, fisch2022calibrated introduced calibration regularization, OCM, and selective calibration as separate methods for FNNs. In contrast, this work presents a single training method for BNNs that combines calibration regularization for improved ID performance, confidence minimization for OOD detection, and selective calibration to ensure a synergistic use of calibration regularization and confidence minimization.
  • Figure 2: Reliability diagrams visualize the calibration performance (\ref{eq:perfect_cal}) of the model by evaluating the average accuracy over the test examples whose predictions share the same confidence value $r$. Typically, FNNs return over-confident decisions, for which the accuracy is lower than the confidence reported by the model.
  • Figure 3: Standard FNN training simeone2022machine minimizes the cross-entropy loss $\mathcal{L}(\theta|\mathcal{D}^\text{tr})$; CFNN training kumar2018trainable minimizes the regularized cross-entropy loss (\ref{CA-FNN}); and BNN learning optimizes the free-energy loss in (\ref{free-energy}) simeone2022machine. The proposed CBNN optimizes a regularized free-energy loss with the aim of capturing epistemic uncertainty, like BNNs, while also accounting directly for calibration performance, like CFNNs.
  • Figure 4: Unlike standard FNNs that aim at maximizing the accuracy on the ID training data set, FNN-OCM caters to the OOD detection task by maximizing the uncertainty for an unlabelled data set that contains OOD inputs. As a result, the model tends to assign different confidence levels $r(x) = p(\hat{y}(x)|x, \mathcal{D}^\text{tr})$ to ID and OOD inputs. The proposed CBNN-OCM accounts for both ID calibration and OOD detection by capturing epistemic uncertainty as well as OOD uncertainty.
  • Figure 5: Given a fixed, pre-trained model parameter vector $\theta \sim q(\theta|\varphi^\text{CBNN-OCM})$, selective calibration aims at achieving well-calibrated decisions (\ref{eq:perfect_cal}) on the selected inputs (hence aiming at (\ref{eq:perfect_cal_sel})) by rejecting inputs on which the discrepancy between confidence and accuracy is expected to be large.
  • ...and 11 more figures
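
The reliability diagram of Figure 2 and the selective evaluation of Figure 5 can be illustrated with a short NumPy sketch. This is a generic binned estimate under stated assumptions, not the paper's evaluation code: the number of bins, the helper names (`reliability_bins`, `expected_calibration_error`, `selective_ece`), and the toy data are hypothetical, and the accept/reject mask is taken as given rather than produced by a trained selector.

```python
import numpy as np

def reliability_bins(conf, correct, n_bins=10):
    """Group predictions by confidence and average within each bin.

    Returns per-bin counts, mean confidence, and mean accuracy;
    empty bins are reported as NaN.
    """
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin i collects confidences in (edges[i], edges[i + 1]].
    idx = np.digitize(conf, edges[1:-1], right=True)
    counts = np.bincount(idx, minlength=n_bins)
    sum_conf = np.bincount(idx, weights=conf, minlength=n_bins)
    sum_acc = np.bincount(idx, weights=correct, minlength=n_bins)
    safe = np.maximum(counts, 1)
    avg_conf = np.where(counts > 0, sum_conf / safe, np.nan)
    avg_acc = np.where(counts > 0, sum_acc / safe, np.nan)
    return counts, avg_conf, avg_acc

def expected_calibration_error(conf, correct, n_bins=10):
    """Weighted average gap between accuracy and confidence over bins."""
    counts, avg_conf, avg_acc = reliability_bins(conf, correct, n_bins)
    mask = counts > 0
    weights = counts[mask] / counts.sum()
    return float(np.sum(weights * np.abs(avg_acc[mask] - avg_conf[mask])))

def selective_ece(conf, correct, keep, n_bins=10):
    """Calibration error on accepted inputs, plus the accepted fraction."""
    keep = np.asarray(keep, dtype=bool)
    conf = np.asarray(conf, dtype=float)[keep]
    correct = np.asarray(correct, dtype=float)[keep]
    return expected_calibration_error(conf, correct, n_bins), float(keep.mean())

# Toy data: two confident predictions (one wrong), two less confident ones.
toy_conf = np.array([0.95, 0.95, 0.65, 0.65])
toy_correct = np.array([1, 0, 1, 1])
ece_all = expected_calibration_error(toy_conf, toy_correct)

# Hypothetical selector output that rejects the second input.
toy_keep = np.array([True, False, True, True])
ece_sel, coverage = selective_ece(toy_conf, toy_correct, toy_keep)
```

Plotting `avg_acc` against `avg_conf` gives the reliability diagram; bins whose accuracy falls below the diagonal correspond to the over-confident behavior described for FNNs. In the toy data, rejecting the over-confident mistake lowers the calibration error on the accepted inputs while reducing coverage, which mirrors the trade-off stated in the abstract.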