Balancing Two Classifiers via A Simplex ETF Structure for Model Calibration

Jiani Ni, He Zhao, Jintong Gao, Dandan Guo, Hongyuan Zha

TL;DR

BalCAL rethinks calibration by balancing a standard learnable classifier with a fixed Simplex ETF classifier derived from Neural Collapse. A confidence-tunable module and a dynamic adjustment mechanism regulate the ETF’s influence, allowing the model to combat both overconfidence and underconfidence without sacrificing accuracy. The approach demonstrates superior calibration (lower ECE/AECE) and improved robustness under distribution shifts and OOD scenarios across CIFAR-10/100, SVHN, and Tiny-ImageNet, with consistent gains when integrated with existing calibration methods. By exploiting the ETF’s scaling properties and a fusion of outputs, BalCAL provides a flexible, deployable calibration regularizer that generalizes across architectures and datasets.

Abstract

In recent years, deep neural networks (DNNs) have demonstrated state-of-the-art performance across various domains. Despite this success, they often suffer from poor calibration, which is particularly problematic in safety-critical applications such as autonomous driving and healthcare, where unreliable predictions can have serious consequences. Recent research has begun to improve model calibration from the perspective of the classifier, but classifier design for calibration remains underexplored. Moreover, most existing methods ignore the calibration errors arising from underconfidence. In this work, we propose BalCAL, a novel calibration method that balances a learnable classifier and an ETF classifier to address both overconfidence and underconfidence. By introducing a confidence-tunable module and a dynamic adjustment method, we ensure better alignment between model confidence and true accuracy. Extensive experiments show that our method significantly improves calibration performance while maintaining high predictive accuracy, outperforming existing techniques. This provides a novel solution to the calibration challenges commonly encountered in deep learning.
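
The gap between confidence and accuracy described in the abstract is commonly measured with the expected calibration error (ECE), the metric named in the TL;DR. The following is a minimal sketch of the standard equal-width binned ECE, not the authors' evaluation code. The function name, the default of 15 bins, and the toy inputs are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, num_bins=15):
    """Equal-width binned ECE: sum over bins of (n_b / n) * |acc_b - conf_b|."""
    if not confidences:
        raise ValueError("need at least one prediction")
    conf_sum = [0.0] * num_bins  # total confidence per bin
    hit_sum = [0.0] * num_bins   # number of correct predictions per bin
    for conf, hit in zip(confidences, correct):
        # Bin b covers [b/num_bins, (b+1)/num_bins); confidence 1.0 goes in the last bin.
        b = min(int(conf * num_bins), num_bins - 1)
        conf_sum[b] += conf
        hit_sum[b] += 1.0 if hit else 0.0
    # (n_b / n) * |acc_b - conf_b| simplifies to |hit_sum_b - conf_sum_b| / n.
    return sum(abs(h - c) for h, c in zip(hit_sum, conf_sum)) / len(confidences)


# Hypothetical toy data: ten predictions per case, each with a single confidence value.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False] * 1)
underconfident = expected_calibration_error([0.6] * 10, [True] * 9 + [False] * 1)
print(overconfident, calibrated, underconfident)
```

ECE takes the absolute gap per bin, so overconfidence (confidence above accuracy) and underconfidence (confidence below accuracy) both increase it. This is why a method that only lowers confidence can worsen calibration for an already underconfident model.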

Paper Structure

This paper contains 47 sections, 1 theorem, 16 equations, 11 figures, 11 tables, 1 algorithm.

Key Result

Theorem 1

In the context of a Simplex ETF initialized as a fixed classifier, the output confidence $\hat{p}_i$ is related to both the scaling factor $\beta$ and the number of classes $K$. Specifically, we have the following relationship: where $\sigma_i \!=\! \bm{z} \hat{\bm{m}}_i$ denotes the score of the sample for the $i$-th class and $\hat{\bm{m}}_i$ is the $i$-th column of the matrix $\mathbf{U} \left(
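
To make the dependence on $\beta$ and $K$ concrete, the sketch below builds a Simplex ETF and evaluates the softmax confidence of a fixed ETF classifier on an idealised feature. It assumes the standard construction from the Neural Collapse literature, $\sqrt{K/(K-1)}\,\mathbf{U}(\mathbf{I}_K - \tfrac{1}{K}\mathbf{1}\mathbf{1}^\top)$, with $\mathbf{U}$ taken to be the identity (feature dimension equal to $K$) to keep the code dependency-free. The closed form in `aligned_confidence` covers only the special case of a unit feature perfectly aligned with one class vector; it is an illustration and is not the statement of Theorem 1. All function names are hypothetical.

```python
import math


def simplex_etf(num_classes):
    """Class vectors of sqrt(K/(K-1)) * (I - (1/K) 11^T); the matrix is symmetric,
    so entry [i] is both the i-th row and the i-th column."""
    k = num_classes
    if k < 2:
        raise ValueError("a Simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    return [[scale * ((1.0 if r == c else 0.0) - 1.0 / k) for c in range(k)]
            for r in range(k)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def etf_confidence(feature, etf, beta):
    """Maximum softmax probability when logits are beta * <feature, class vector>."""
    logits = [beta * sum(f * w for f, w in zip(feature, vec)) for vec in etf]
    return max(softmax(logits))


def aligned_confidence(num_classes, beta):
    """Closed form for a unit feature equal to one class vector: the logits are
    beta for that class and -beta / (K - 1) for every other class."""
    k = num_classes
    return 1.0 / (1.0 + (k - 1) * math.exp(-beta * k / (k - 1)))


etf = simplex_etf(10)
for beta in (1.0, 5.0, 10.0):
    print(beta, etf_confidence(etf[0], etf, beta), aligned_confidence(10, beta))
```

Each class vector has unit norm and every pair has inner product $-1/(K-1)$, so in this aligned case the confidence rises monotonically with $\beta$ and equals $1/K$ at $\beta = 0$. A single scalar on a fixed classifier can therefore push confidence either up or down.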

Figures (11)

  • Figure 1: The motivation of BalCAL. Calibration error arises from the discrepancy between model confidence and actual accuracy, often manifesting as overconfidence (a). After incorporating Mixup, underconfidence may also occur (a). We propose a method to dynamically adjust confidence, addressing both issues simultaneously (b). Calibration performance is shown in (c).
  • Figure 2: An illustration of BalCAL. (a) The left side gives an overview of our method: the image is input into a shared encoder, and the encoded feature is fed into both the learnable classifier and the confidence-tunable module, where the latter adjusts its confidence to complement that of the former. A dynamic adjustment mechanism balances the confidence between the two components. (b) The right side depicts the confidence-tunable module, consisting of an adapter and a fixed ETF classifier, which works in tandem with the learnable classifier to refine the overall model’s confidence.
  • Figure 3: Impact of different $\beta$ on the output confidence of ETF classifiers on CIFAR-10.
  • Figure 4: Calibration performance of various methods under distribution shifts on CIFAR-10 (left) and CIFAR-100 (right). The x-axis denotes the shift level (larger numbers indicate a greater degree of shift) and the y-axis is the ECE (smaller is better).
  • Figure 5: Calibration sensitivity of Mixup, MIT, and Mixup+Ours with combination ratios $\alpha$ on CIFAR-10.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Definition 1
  • Theorem 1