When does a predictor know its own loss?
Aravind Gollakota, Parikshit Gopalan, Aayush Karan, Charlotte Peale, Udi Wieder
TL;DR
This work establishes a tight link between loss prediction and multicalibration in binary classification. It defines a hierarchy of loss predictors, with the self-entropy predictor as a baseline, and shows that achieving a nontrivial loss-prediction advantage over that baseline is equivalent to auditing for multicalibration violations, across prediction-only, input-aware, and representation-aware settings. The authors prove theorems connecting loss-prediction improvements to multicalibration errors, and extend the results to multiple losses via finite-basis methods, enabling efficient multicalibration with respect to rich loss classes. Empirical results on UCI datasets corroborate the theory: loss-prediction advantage grows with multicalibration error, and is largest on subgroups with higher calibration error. The findings suggest a practical auditing approach: train regression-based loss predictors, and treat any advantage they gain over the self-entropy baseline as evidence of a calibration gap that can guide model improvement before deployment in downstream tasks.
Abstract
Given a predictor and a loss function, how well can we predict the loss that the predictor will incur on an input? This is the problem of loss prediction, a key computational task associated with uncertainty estimation for a predictor. In a classification setting, a predictor will typically predict a distribution over labels and hence have its own estimate of the loss that it will incur, given by the entropy of the predicted distribution. Should we trust this estimate? In other words, when does the predictor know what it knows and what it does not know? In this work we study the theoretical foundations of loss prediction. Our main contribution is to establish tight connections between nontrivial loss prediction and certain forms of multicalibration, a multigroup fairness notion that asks for calibrated predictions across computationally identifiable subgroups. Formally, we show that a loss predictor that is able to improve on the self-estimate of a predictor yields a witness to a failure of multicalibration, and vice versa. This has the implication that nontrivial loss prediction is in effect no easier or harder than auditing for multicalibration. We support our theoretical results with experiments that show a robust positive correlation between the multicalibration error of a predictor and the efficacy of training a loss predictor.
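To make the self-estimate concrete, the following is a minimal sketch rather than the authors' experimental setup. It assumes log loss, a synthetic two-group population, and squared error as the yardstick for comparing loss predictors; the function names, the group structure, and the constants 0.5, 0.7, and 0.9 are all hypothetical. The predictor outputs 0.7 everywhere, which is calibrated on average but not within either group, so a loss predictor that looks at the group can improve on the entropy-based self-estimate.

```python
import numpy as np


def log_loss(y, p):
    """Per-example log loss of predicted probability p on binary label y."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


def self_entropy(p):
    """The predictor's own loss estimate: expected log loss if y ~ Bernoulli(p)."""
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def simulate(n=200_000, seed=0):
    """Synthetic data (an assumption for illustration, not from the paper)."""
    rng = np.random.default_rng(seed)
    group = rng.integers(0, 2, size=n)        # hypothetical subgroup indicator
    p_true = np.where(group == 1, 0.9, 0.5)   # true P(y = 1 | group)
    y = (rng.random(n) < p_true).astype(float)
    p_pred = np.full(n, 0.7)                  # calibrated overall, not per group
    return group, y, p_pred


def loss_prediction_advantage(group, y, p_pred):
    """Squared-error gain of a group-aware loss predictor over self-entropy."""
    realized = log_loss(y, p_pred)
    baseline = self_entropy(p_pred)
    # Input-aware loss predictor: in-sample mean realized loss within each group.
    group_means = np.array([realized[group == g].mean() for g in (0, 1)])
    learned = group_means[group]
    mse_baseline = np.mean((baseline - realized) ** 2)
    mse_learned = np.mean((learned - realized) ** 2)
    return mse_baseline - mse_learned


group, y, p_pred = simulate()
advantage = loss_prediction_advantage(group, y, p_pred)
print(f"loss-prediction advantage: {advantage:.4f}")
```

In this toy setting the advantage is positive precisely because the predictor is miscalibrated within each group, which is the kind of multicalibration failure the abstract refers to. The group means are fit in-sample for brevity; a real audit would evaluate the loss predictor on held-out data.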
