Feature Clipping for Uncertainty Calibration

Linwei Tao, Minjing Dong, Chang Xu

TL;DR

As the first calibration technique based on feature modification, feature clipping offers a novel approach to improving model calibration, showing significant improvements over both post-hoc and train-time calibration methods and pioneering a new avenue for feature-based model calibration.

Abstract

Deep neural networks (DNNs) have achieved significant success across various tasks, but ensuring reliable uncertainty estimates, known as model calibration, is crucial for their safe and effective deployment. Modern DNNs often suffer from overconfidence, leading to miscalibration. We propose a novel post-hoc calibration method called feature clipping (FC) to address this issue. FC involves clipping feature values to a specified threshold, effectively increasing entropy in high calibration error samples while maintaining the information in low calibration error samples. This process reduces the overconfidence in predictions, improving the overall calibration of the model. Our extensive experiments on datasets such as CIFAR-10, CIFAR-100, and ImageNet, and models including CNNs and transformers, demonstrate that FC consistently enhances calibration performance. Additionally, we provide a theoretical analysis that validates the effectiveness of our method. As the first calibration technique based on feature modification, feature clipping offers a novel approach to improving model calibration, showing significant improvements over both post-hoc and train-time calibration methods and pioneering a new avenue for feature-based model calibration.
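The clipping step described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: it assumes clipping is applied to the penultimate-layer features before a linear classification head, and that values are clipped symmetrically to the interval [-c, c]. The names `feature_clip`, `predict`, `weight`, `bias`, and the toy data are hypothetical.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def entropy(probs):
    # Shannon entropy (in nats) of each row of class probabilities.
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)


def feature_clip(features, c):
    # Assumption: clip every feature value to the interval [-c, c].
    return np.clip(features, -c, c)


def predict(features, weight, bias, c=None):
    # Post-hoc use: the trained head (weight, bias) is left untouched;
    # only the features fed into it are clipped when c is given.
    if c is not None:
        features = feature_clip(features, c)
    return softmax(features @ weight + bias)


# Toy data standing in for penultimate-layer features (hypothetical).
rng = np.random.default_rng(0)
features = np.abs(rng.normal(size=(4, 16))) * 3.0
weight = rng.normal(size=(16, 10))
bias = np.zeros(10)

probs_raw = predict(features, weight, bias)
probs_fc = predict(features, weight, bias, c=0.5)

# Per-sample entropy before and after clipping.
print(entropy(probs_raw))
print(entropy(probs_fc))
```

In the limiting case `c=0` with a zero bias, every feature is zeroed and the prediction becomes uniform, which is the maximum-entropy output. Useful thresholds presumably lie between that extreme and no clipping at all.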

Paper Structure

This paper contains 33 sections, 1 theorem, 43 equations, 5 figures, 8 tables.

Key Result

Theorem 1

After feature clipping, high calibration error samples exhibit a larger entropy difference than low calibration error samples.
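One way to make the quantity in this statement concrete is sketched below. The notation is an assumption for illustration and may differ from the paper's: $x$ is a feature vector, $c$ the clipping hyperparameter, $W$ and $b$ the weights and bias of a linear classification head, $K$ the number of classes, and symmetric clipping to $[-c, c]$ is assumed.

$$
\tilde{x}_i = \min(\max(x_i, -c),\, c), \qquad
H(p) = -\sum_{k=1}^{K} p_k \log p_k, \qquad
\Delta H(x; c) = H\big(\mathrm{softmax}(W\tilde{x} + b)\big) - H\big(\mathrm{softmax}(Wx + b)\big)
$$

Under this reading, the theorem says that $\Delta H$ is larger for high calibration error samples than for low calibration error samples.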

Figures (5)

  • Figure 1: Feature difference between high calibration error samples and low calibration error samples.
  • Figure 2: Average absolute feature value of samples with high or low calibration error on a Vision Transformer. We randomly select 50 feature units out of 2048 units. The high/low calibration error samples are selected as the wrongly/correctly predicted samples with confidence larger than 0.8.
  • Figure 3: Feature clipping at different values. Points toward the bottom-right corner indicate better performance. The experiment is conducted with ResNet-50 on CIFAR-10.
  • Figure 4: The derivative of $\Delta H$. Different colors indicate different choices of the clipping hyperparameter $c$.
  • Figure 5: Average feature value of samples with high or low calibration error across all 2048 feature dimensions. The experiment is conducted with ResNet-50 on CIFAR-10.
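The threshold comparison in Figure 3 suggests a sweep over candidate clipping values. The sketch below is a hedged illustration of such a sweep, not the authors' procedure: it assumes symmetric clipping to [-c, c], a linear head, expected calibration error (ECE) with 15 equal-width confidence bins as the calibration metric, and selection of `c` by lowest ECE on held-out data. The function names, the candidate values, and the random toy data are all hypothetical.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=15):
    # Standard binned ECE: weighted mean of |accuracy - confidence| per bin.
    conf = probs.max(axis=-1)
    correct = (probs.argmax(axis=-1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)


def sweep_thresholds(features, labels, weight, bias, candidates):
    # For each candidate c, clip the features and record (c, accuracy, ECE).
    results = []
    for c in candidates:
        probs = softmax(np.clip(features, -c, c) @ weight + bias)
        acc = float((probs.argmax(axis=-1) == labels).mean())
        results.append((c, acc, expected_calibration_error(probs, labels)))
    return results


# Random toy data standing in for held-out features and labels (hypothetical).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 32)) * 2.0
weight = rng.normal(size=(32, 10))
bias = np.zeros(10)
labels = rng.integers(0, 10, size=200)

candidates = [0.1, 0.5, 1.0, 2.0, 10.0]
results = sweep_thresholds(features, labels, weight, bias, candidates)

# Pick the threshold with the lowest ECE; accuracy is reported alongside it.
best_c, best_acc, best_ece = min(results, key=lambda r: r[2])
for c, acc, ece in results:
    print(f"c={c:<5} accuracy={acc:.3f} ECE={ece:.3f}")
```

Reporting accuracy next to ECE matters because, unlike methods that only rescale logits, clipping features can change the predicted class, so a threshold should be judged on both axes.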

Theorems & Definitions (1)

  • Theorem 1