Towards Unbiased Calibration using Meta-Regularization

Cheng Wang; Jacek Golebiowski

TL;DR

Learning sample-wise gamma as a continuous variable effectively improves calibration, and combining gamma-net with SECE achieves the best calibration performance across various calibration metrics while retaining predictive performance that is very competitive with multiple recently proposed methods.

Abstract

Model miscalibration has been frequently identified in modern deep neural networks. Recent work aims to improve model calibration directly through a differentiable calibration proxy. However, the resulting calibration is often biased due to the binning mechanism. In this work, we propose to learn better-calibrated models via meta-regularization, which has two components: (1) the gamma network (gamma-net), a meta-learner that outputs sample-wise gamma values (continuous variables) for the focal loss that regularizes the backbone network; (2) the smooth expected calibration error (SECE), a Gaussian-kernel-based, unbiased, and differentiable surrogate for ECE that enables the smooth optimization of gamma-net. We evaluate the effectiveness of the proposed approach in regularizing neural networks towards improved and unbiased calibration on three computer vision datasets. We empirically demonstrate that: (a) learning sample-wise gamma as continuous variables can effectively improve calibration; (b) SECE smoothly optimizes gamma-net towards unbiased and robust calibration with respect to the binning schemes; and (c) the combination of gamma-net and SECE achieves the best calibration performance across various calibration metrics while retaining very competitive predictive performance as compared to multiple recently proposed methods.
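To make component (1) concrete, the following is a minimal sketch of a focal loss that accepts one gamma per sample. It uses the standard focal loss form, $-(1 - p_t)^{\gamma} \log p_t$, and assumes the gamma values are simply passed in as a list; in the paper they are produced by gamma-net, which is not modelled here. The function name `focal_loss` and the toy inputs are illustrative assumptions, not the authors' code.

```python
import math

def focal_loss(probs, labels, gammas):
    """Mean focal loss with one gamma per sample (hypothetical helper).

    probs:  list of per-class probability lists (each row sums to 1)
    labels: list of integer class indices
    gammas: list of non-negative floats, one per sample
    """
    total = 0.0
    for p_row, y, g in zip(probs, labels, gammas):
        # Probability assigned to the true class, clipped to keep log finite.
        p_t = max(p_row[y], 1e-12)
        # (1 - p_t)^g down-weights samples the model already predicts confidently.
        total += -((1.0 - p_t) ** g) * math.log(p_t)
    return total / len(labels)

# Toy inputs (assumed for illustration): two samples, both with true class 0.
probs = [[0.9, 0.1], [0.6, 0.4]]
labels = [0, 0]

ce = focal_loss(probs, labels, [0.0, 0.0])     # gamma = 0 reduces to cross-entropy
fl = focal_loss(probs, labels, [2.0, 2.0])     # one shared gamma for all samples
mixed = focal_loss(probs, labels, [0.0, 2.0])  # a different gamma per sample
```

With gamma set to 0 for every sample the loss equals the usual cross-entropy, while larger gamma values shrink the contribution of confidently predicted samples. Allowing gamma to differ per sample, as in `mixed`, is the degree of freedom that a meta-learner such as gamma-net could exploit.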

Paper Structure (25 sections, 11 equations, 10 figures, 5 tables, 1 algorithm)


Introduction
Related Work
Preliminaries
Model Calibration
Focal Loss
Meta-Learning
Methods
$\gamma$-Net: Learning Sample-Wise Gamma for Focal Loss
SECE: Smooth Expected Calibration Error
Optimising $\gamma$-Net with SECE
Experiments
Predictive and Calibration Performance
Ablation Study
Learning $\gamma$ as continuous variables
Calibration bias and robustness
...and 10 more sections

Figures (10)

Figure 1: Our proposed approach for regularizing the base network towards better calibration includes two new components: $\gamma$-Net and SECE. The inner loop optimizes the backbone network (e.g., ResNet), which uses focal loss as its objective function. The $\gamma$-Net in the outer loop takes the second-to-last layer representation extracted from the backbone network as input and learns to output a sample-wise $\gamma$ for focal loss in a continuous space. The $\gamma$-Net is optimized using the proposed SECE, a Gaussian kernel-based, unbiased, and differentiable calibration error.
Figure 2: Reliability diagrams for models on the CIFAR-100 test set. The value in parentheses ($\cdot$) is the test error. The diagonal dashed line represents perfect calibration. The red bar represents the gap between the observed accuracy and the desired accuracy of the perfectly calibrated model (the diagonal); it is positive if the observed accuracy is lower and negative otherwise. The model from the 5$^{th}$ run is used.
Figure 3: (a-b): ECE curves on the test dataset of CIFAR-10 (a) and CIFAR-100 (b). (c-d): The mean and standard deviation (std.) of $\gamma$ on the test dataset at each epoch. A low std. indicates that samples share similar $\gamma$ values, and a high std. indicates that $\gamma$ values differ more across samples.
Figure 4: The reliability diagram plots on CIFAR-100 with large bin numbers (top to bottom: 20, 50, 100).
Figure 5: Changes in ECE (left) and MCE (right) scores on the CIFAR-10 test dataset as the number of bins increases over [10, 20, 50, 100, 200, 500, 1000]. FL$_{\gamma}$-SECE demonstrates superior robustness to increasing bin numbers, as evidenced by lower MCE. A similar plot for CIFAR-100 is provided in the Appendix.
...and 5 more figures
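
Several of the captions above report ECE and MCE under different bin numbers. As a point of reference, here is a minimal sketch of the standard binned estimators, assuming equal-width confidence bins; the function name `binned_calibration_errors` and the toy data are illustrative assumptions, and this is the conventional binned metric, not the paper's SECE.

```python
def binned_calibration_errors(confidences, correct, n_bins=10):
    """Return (ECE, MCE) using equal-width confidence bins on [0, 1]."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, hit))
    ece, mce = 0.0, 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        gap = abs(accuracy - avg_conf)
        ece += len(members) / n * gap  # ECE: sample-weighted average gap
        mce = max(mce, gap)            # MCE: worst gap over non-empty bins
    return ece, mce

# Toy data (assumed): top-class confidences and whether each prediction was correct.
confidences = [0.95, 0.95, 0.55, 0.55]
correct = [1, 1, 1, 0]

ece_10, mce_10 = binned_calibration_errors(confidences, correct, n_bins=10)
ece_1, mce_1 = binned_calibration_errors(confidences, correct, n_bins=1)
```

On this toy data the same predictions score differently depending only on the number of bins: with a single bin the over- and under-confident samples cancel out, whereas ten bins expose a gap in each occupied bin. This sensitivity to the binning scheme is the kind of effect the bin-number experiments in Figures 4 and 5 examine.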
