Adaptive Set-Mass Calibration with Conformal Prediction
Daniil Kazantsev, Mohsen Guizani, Eric Moulines, Maxim Panov, Nikita Kotelevskii
TL;DR
This work introduces cumulative mass calibration (CMC) and the CMCE metric to evaluate set-valued calibration, addressing the gap that traditional confidence-wise or class-wise calibration does not guarantee predictive-set validity. It builds on split conformal prediction to obtain marginally valid predictive sets and then applies two simple post-hoc procedures, mass rescaling and temperature scaling, to enforce a cumulative mass constraint at a chosen level $1-\alpha$, yielding $\alpha$-cumulative-mass-calibrated classifiers with marginal guarantees. Empirically, the proposed methods consistently improve CMCE and often other metrics (ECE, cw-ECE, NLL, Brier) on large-class benchmarks (e.g., CIFAR-100, ImageNet, iNaturalist21), and produce near-ideal cumulative-mass calibration curves, especially as the number of classes grows. The results demonstrate practical, scalable calibration with theoretical marginal guarantees, while highlighting limitations related to choosing $\alpha$, conditional calibration, and potential conservativeness in heterogeneous data settings.
Abstract
Reliable probabilities are critical in high-risk applications, yet common calibration criteria (confidence, class-wise) are necessary but not sufficient for full distributional calibration, and post-hoc methods often lack distribution-free guarantees. We propose a set-based notion of calibration, cumulative mass calibration, and a corresponding empirical error measure: the Cumulative Mass Calibration Error (CMCE). We develop a new calibration procedure that starts with conformal prediction to obtain a set of labels with the desired coverage. We then instantiate two simple post-hoc calibrators: a mass normalization rule and a temperature-scaling-based rule, both tuned to the conformal constraint. On multi-class image benchmarks, especially with a large number of classes, our methods consistently improve CMCE and standard metrics (ECE, cw-ECE, MCE) over baselines, delivering a practical, scalable framework with theoretical guarantees.
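To make the two-stage procedure concrete, the following is a minimal sketch of one possible reading of it, not the authors' implementation. It assumes the nonconformity score $s(x, y) = 1 - \hat{p}_y(x)$, and it assumes that mass normalization means rescaling the probabilities so that the labels inside the conformal set carry total mass $1-\alpha$ and the remaining labels carry mass $\alpha$. The function names `conformal_threshold`, `prediction_sets`, and `rescale_mass` are hypothetical.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold for the ASSUMED score s(x, y) = 1 - p_y(x)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1.0 - alpha)))  # rank of the conformal quantile
    if k > n:
        return np.inf  # too few calibration points: every label is included
    return np.sort(scores)[k - 1]

def prediction_sets(probs, qhat):
    """Boolean mask of shape (n, K): label y is in the set if 1 - p_y <= qhat."""
    return 1.0 - probs <= qhat

def rescale_mass(probs, in_set, alpha):
    """ASSUMED mass rescaling: in-set mass becomes 1 - alpha, the rest alpha.

    Rows whose set is empty or covers all positive-mass labels are left
    unchanged, since the constraint cannot be met by rescaling alone.
    """
    mass_in = np.where(in_set, probs, 0.0).sum(axis=1, keepdims=True)
    mass_out = np.where(in_set, 0.0, probs).sum(axis=1, keepdims=True)
    ok = (mass_in > 0) & (mass_out > 0)
    # Inner np.where guards against division by zero on the skipped rows.
    scale_in = np.where(ok, (1.0 - alpha) / np.where(ok, mass_in, 1.0), 1.0)
    scale_out = np.where(ok, alpha / np.where(ok, mass_out, 1.0), 1.0)
    return probs * np.where(in_set, scale_in, scale_out)
```

In use, `conformal_threshold` would be fitted once on a held-out calibration split, and `prediction_sets` followed by `rescale_mass` would then be applied to test-time softmax outputs. The temperature-scaling variant would instead search for a temperature at which the in-set mass reaches $1-\alpha$; it is omitted here because the text above does not specify its exact form.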
