MaxEnt Loss: Constrained Maximum Entropy for Calibration under Out-of-Distribution Shift
Dexter Neo, Stefan Winkler, Tsuhan Chen
TL;DR
This work introduces MaxEnt Loss, a constrained maximum-entropy regularizer for calibrating models under out-of-distribution (OOD) shift. The method maximizes entropy subject to mean and/or variance constraints computed from the training data and enforces them through Lagrange multipliers, yielding three end-to-end forms: Mean (M), Variance (V), and Mean+Variance (M+V). All three improve calibration without sacrificing accuracy. The paper explicitly links MaxEnt Loss to Focal loss and shows, on extensive synthetic and real-world OOD benchmarks, that it achieves state-of-the-art calibration across diverse datasets while remaining compatible with calibration techniques applied during training (label smoothing) or post hoc (temperature scaling). It also analyzes the role of local constraints and feature-norm ordering, and discusses limitations and directions for future work, including adaptive multiplier schemes for greater robustness to unseen shifts.
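To make the Mean (M) form concrete, the following is a minimal sketch of how an entropy-regularized loss with a mean constraint might be assembled for a single example. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of class indices as the constrained statistic, the choice of `mu` as the average training label, and the fixed multiplier `lam` are all hypothetical. The summary above states only that the constraints are derived from training data and enforced via Lagrange multipliers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def maxent_mean_loss(logits, label, mu, lam):
    """Cross-entropy minus entropy plus a Lagrangian mean-constraint term.

    Assumptions (not taken from the source): the constrained statistic is
    the class index k, `mu` is its target mean estimated from training
    labels, and `lam` is a multiplier supplied by the caller.
    """
    p = softmax(logits)
    eps = 1e-12  # guards against log(0)
    cross_entropy = -math.log(p[label] + eps)
    entropy = -sum(pk * math.log(pk + eps) for pk in p)
    # Gap between the predicted mean class index and the training-derived mean.
    mean_gap = sum(k * pk for k, pk in enumerate(p)) - mu
    # Subtracting the entropy means that minimizing the loss maximizes entropy.
    return cross_entropy - entropy + lam * mean_gap

# Hypothetical usage: mu is the average class index over the training labels.
train_labels = [0, 1, 1, 2]
mu = sum(train_labels) / len(train_labels)
loss = maxent_mean_loss([2.0, 0.5, -1.0], label=0, mu=mu, lam=0.1)
```

In this sketch the entropy term discourages overconfident predictions, while the constraint term pulls the predicted mean toward the statistic observed in training. How the multiplier is actually chosen or solved for is left open here.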
Abstract
We present a new loss function that addresses the out-of-distribution (OOD) calibration problem. While many objective functions have been proposed to effectively calibrate models in-distribution, our findings show that they do not always fare well OOD. Based on the Principle of Maximum Entropy, we incorporate helpful statistical constraints observed during training, delivering better model calibration without sacrificing accuracy. We provide theoretical analysis and show empirically that our method works well in practice, achieving state-of-the-art calibration on both synthetic and real-world benchmarks.
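As background for the Principle of Maximum Entropy invoked above, the textbook derivation for a single mean constraint runs as follows. The symbols are assumptions introduced for illustration and do not come from the abstract: $p_i$ is the probability of class $i$ among $K$ classes, $f(i)$ is the constrained statistic, $\mu$ is its target value observed in training, and $\lambda_0$, $\lambda$ are Lagrange multipliers.

```latex
\begin{align}
  \max_{p}\; H(p) &= -\sum_{i=1}^{K} p_i \log p_i
  \quad \text{s.t.} \quad \sum_{i=1}^{K} p_i = 1,
  \qquad \sum_{i=1}^{K} p_i\, f(i) = \mu, \\
  \mathcal{L}(p, \lambda_0, \lambda) &= -\sum_{i=1}^{K} p_i \log p_i
  - \lambda_0 \Big( \sum_{i=1}^{K} p_i - 1 \Big)
  - \lambda \Big( \sum_{i=1}^{K} p_i\, f(i) - \mu \Big), \\
  \frac{\partial \mathcal{L}}{\partial p_i} &= -\log p_i - 1 - \lambda_0 - \lambda f(i) = 0
  \;\;\Longrightarrow\;\;
  p_i = \frac{\exp\!\big(-\lambda f(i)\big)}{\sum_{j=1}^{K} \exp\!\big(-\lambda f(j)\big)}.
\end{align}
```

The multiplier $\lambda$ is then fixed by substituting this solution back into the mean constraint. With $\lambda = 0$ the constraint is inactive and the solution reduces to the uniform distribution, which is the unconstrained entropy maximizer.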
