Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration

Yung-Chen Tang; Pin-Yu Chen; Tsung-Yi Ho

TL;DR

This paper tackles the problem of neural network calibration, aiming to align prediction confidences with true probabilities in a post-processing setting. It introduces Neural Clamping, a joint input-output calibration method that learns a universal input perturbation $\bm{\delta}$ and a temperature $T$ to recalibrate a frozen classifier, optimized on a calibration set with focal loss. The authors establish a theoretical justification showing that this joint approach maximizes entropy relative to plain temperature scaling and provide a data-driven rule for initializing $\bm{\delta}$, along with an efficient training variant. Empirically, Neural Clamping consistently achieves state-of-the-art calibration across BloodMNIST, CIFAR-100, and ImageNet over diverse architectures, often reducing both $\text{ECE}$ and $\text{AECE}$ by substantial margins and sometimes improving accuracy, demonstrating strong practical impact for reliable uncertainty estimation.

Abstract

Neural network calibration is an essential task in deep learning to ensure consistency between the confidence of model prediction and the true correctness likelihood. In this paper, we propose a new post-processing calibration method called Neural Clamping, which employs a simple joint input-output transformation on a pre-trained classifier via a learnable universal input perturbation and an output temperature scaling parameter. Moreover, we provide theoretical explanations on why Neural Clamping is provably better than temperature scaling. Evaluated on BloodMNIST, CIFAR-100, and ImageNet image recognition datasets and a variety of deep neural network models, our empirical results show that Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods. The code is available at github.com/yungchentang/NCToolkit, and the demo is available at huggingface.co/spaces/TrustSafeAI/NCTV.
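The joint input-output transformation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation (that is in the linked repository). It assumes the calibrated output takes the form softmax(f(x + delta) / T) and that the objective is the standard focal loss. The names `clamped_probs`, `focal_loss`, and `frozen_model`, the toy linear classifier, and the random data are all hypothetical stand-ins. The optimization loop that would fit `delta` and `temperature` on the calibration set is omitted.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def clamped_probs(model, x, delta, temperature):
    # Assumed calibrated output: add one shared perturbation to every input,
    # run the frozen model, then divide the logits by the temperature.
    return softmax(model(x + delta) / temperature)

def focal_loss(probs, labels, gamma):
    # Standard focal loss: -(1 - p_y)^gamma * log(p_y), averaged over samples.
    # With gamma = 0 this is the ordinary cross-entropy loss.
    p_true = probs[np.arange(len(labels)), labels]
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true + 1e-12)))

def mean_entropy(probs):
    # Average Shannon entropy of the predicted distributions.
    return float(np.mean(-np.sum(probs * np.log(probs + 1e-12), axis=-1)))

# Hypothetical frozen classifier: a fixed linear map from 8 features to 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))

def frozen_model(x):
    return x @ W

# Hypothetical calibration set.
x_cal = rng.normal(size=(32, 8))
y_cal = rng.integers(0, 3, size=32)

delta = np.zeros(8)   # universal input perturbation (zero = no input change)
temperature = 1.5     # output temperature

probs = clamped_probs(frozen_model, x_cal, delta, temperature)
loss = focal_loss(probs, y_cal, gamma=1.0)
```

In this sketch only `delta` and `temperature` would be trainable; the weights `W` stay fixed, which mirrors the post-processing setting in which the classifier itself is not retrained.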

Paper Structure (23 sections, 2 theorems, 20 equations, 6 figures, 12 tables, 1 algorithm)

Introduction
Background and Related Work
Probabilistic Characterization of Neural Network Calibration
Calibration Metrics
Post-Processing Calibration Methods
Neural Clamping
Joint Input-Output Calibration
Training Objective Function in Neural Clamping
How to Choose a Proper $\gamma$ Value in Focal Loss for Neural Clamping?
Theoretical Justification on the Advantage of Neural Clamping
Performance Evaluation
Evaluation and Implementation Details
BloodMNIST, CIFAR-100, and ImageNet Results
Additional Analysis of Neural Clamping
Conclusion
...and 8 more sections

Key Result

Lemma 3.1

For any input perturbation $\bm{\delta}$, let $f_{\theta}(\cdot)=[f_{\theta}^{(1)},\ldots,f_{\theta}^{(K)}]$ be a fixed $K$-way neural network classifier and let $\bm{z}$ be the output logits of a perturbed data input $\bm{x}+\bm{\delta}$. Then the proposed form of joint input-output calibration in

Figures (6)

Figure 1: Overview of Neural Clamping: a joint input-output post-processing calibration framework.
Figure 2: Neural Clamping on ResNet-50/ResNet-110 and Wide-ResNet-40-10 with different $\gamma$ values and the resulting expected calibration error (ECE), training loss, and entropy on BloodMNIST and CIFAR-100. When $\gamma=0$, focal loss reduces to cross-entropy loss. The experimental setup is the same as in Section 4.
Figure 3: Comparison of random (blue) and data-driven (green) initializations for the input calibration $\bm{\delta}$ in Neural Clamping. The reported results are (a) Entropy, (b) ECE, (c) AECE, and (d) SCE of ResNet-110 on CIFAR-100 over 5 runs. Each boxplot shows the spread of a metric across runs through its quartiles. The data-driven initialization is more stable (smaller variation) than random initialization.
Figure 4: Reliability diagram of ResNet-50 on BloodMNIST, using the 15-bin ECE metric.
Figure 5: Reliability diagrams of (a) ResNet-110 and (b) Wide ResNet-40-10 on CIFAR-100, using the 15-bin ECE metric.
...and 1 more figure
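The captions above refer to a 15-bin ECE metric. As a rough sketch of what such a metric computes, the block below uses the common equal-width binning definition: the sample-weighted average gap between accuracy and mean confidence within each confidence bin. The function name `expected_calibration_error` and the toy inputs are hypothetical, and the paper's own evaluation code may differ in details such as bin boundaries.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    # confidences: top-class predicted probability for each sample.
    # correct: 1 if the prediction was right, 0 otherwise.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; a top-class confidence is always above 0.
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

# Toy example: two predictions at 0.9 confidence, only one of them correct.
toy_ece = expected_calibration_error([0.9, 0.9], [1, 0])
```

A reliability diagram plots the per-bin accuracy against the per-bin confidence; ECE summarizes the same per-bin gaps as a single number.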

Theorems & Definitions (4)

Lemma 3.1: optimality of joint input-output calibration
Theorem 3.2
proof
proof
