PostHoc FREE Calibrating on Kolmogorov Arnold Networks
Wenhao Liang, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen
TL;DR
The paper tackles miscalibration in Kolmogorov-Arnold Networks (KANs), which use spline-based, edge-focused activations that can yield overconfident predictions in dense regions and underconfident ones in sparse areas. It introduces Temperature-Scaled Loss (TSL), a training-time objective that jointly optimizes the network parameters and a learnable temperature parameter $\tau$ to directly shape the predictive distribution, preserving strict propriety of the base loss. The authors provide theoretical guarantees (local convergence and reduction in calibration error) and empirical evidence across diverse vision benchmarks showing that TSL consistently reduces calibration error (ECE and variants) while maintaining competitive accuracy. They also analyze how KAN hyperparameters influence calibration and demonstrate that TSL mitigates grid-induced miscalibration without requiring post-hoc adjustments, offering practical guidance for spline-based networks and potential applicability to other architectures. The work contributes a principled, effective approach to calibration in flexible spline-based models, with implications for safety-critical and risk-sensitive applications.
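Expected calibration error (ECE), the metric referred to above, can be illustrated with a minimal sketch. It assumes the common equal-width binning of top-class confidence; the function name `expected_calibration_error` and the default of 15 bins are illustrative choices, not details taken from the paper.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """Equal-width-bin ECE over top-class confidence (assumed variant)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)        # confidence of the predicted class
    predictions = probs.argmax(axis=1)     # predicted class index
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # |accuracy - mean confidence| in this bin, weighted by bin mass
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```

A perfectly calibrated model has an ECE of zero; overconfident predictions show up as bins whose mean confidence exceeds their accuracy.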
Abstract
Kolmogorov-Arnold Networks (KANs) are neural architectures inspired by the Kolmogorov-Arnold representation theorem that leverage B-spline parameterizations for flexible, locally adaptive function approximation. Although KANs can capture complex nonlinearities beyond those modeled by standard Multilayer Perceptrons (MLPs), they frequently produce miscalibrated confidence estimates, with overconfidence in dense data regions and underconfidence in sparse ones. In this work, we systematically examine how four critical hyperparameters (layer width, grid order, shortcut function, and grid range) affect the calibration of KANs. Furthermore, we introduce a novel Temperature-Scaled Loss (TSL) that integrates a temperature parameter directly into the training objective, dynamically adjusting the predictive distribution during learning. Both theoretical analysis and extensive empirical evaluations on standard benchmarks demonstrate that TSL significantly reduces calibration errors, thereby improving the reliability of probabilistic predictions. Overall, our study provides actionable insights into the design of spline-based neural networks and establishes TSL as a robust training loss for enhancing calibration.
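The abstract does not state the exact form of TSL. As a rough illustration only, the sketch below assumes the simplest reading: cross-entropy computed on logits divided by a learnable temperature, parameterized as `tau = exp(log_tau)` so that it stays positive. The names `temperature_scaled_nll` and `grad_wrt_log_tau`, the toy logits, and the step size are all hypothetical; in actual training the network weights would be updated jointly with the temperature, whereas here the logits are held fixed.

```python
import numpy as np

def _log_softmax(z):
    # Numerically stable log-softmax along the class axis.
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def temperature_scaled_nll(logits, labels, log_tau):
    """Mean cross-entropy of softmax(logits / tau), with tau = exp(log_tau)."""
    scaled = logits / np.exp(log_tau)
    log_probs = _log_softmax(scaled)
    return -log_probs[np.arange(len(labels)), labels].mean()

def grad_wrt_log_tau(logits, labels, log_tau):
    """Analytic gradient of the loss above with respect to log_tau."""
    scaled = logits / np.exp(log_tau)
    probs = np.exp(_log_softmax(scaled))
    onehot = np.zeros_like(probs)
    onehot[np.arange(len(labels)), labels] = 1.0
    # d(loss)/d(scaled) = probs - onehot, and d(scaled)/d(log_tau) = -scaled
    return ((probs - onehot) * (-scaled)).sum(axis=1).mean()

# Toy logits (assumed values); the second sample is confidently wrong.
logits = np.array([[4.0, 0.0, 0.0],
                   [3.0, 1.0, 0.0],
                   [0.0, 5.0, 0.0],
                   [0.0, 0.0, 6.0]])
labels = np.array([0, 1, 1, 2])

log_tau = 0.0
for _ in range(200):
    # Plain gradient descent on the temperature only.
    log_tau -= 0.1 * grad_wrt_log_tau(logits, labels, log_tau)
```

Because one toy sample is confidently misclassified, the gradient at `log_tau = 0` pushes the temperature above 1, which softens the predicted distribution.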
