Consistency Conditions for Differentiable Surrogate Losses
Drona Khurana, Anish Thilagar, Dhamma Kimpara, Rafael Frongillo
TL;DR
This work analyzes the statistical consistency of convex differentiable surrogate losses for discrete prediction by connecting calibration to indirect elicitation (IE). It proves that, under mild conditions, IE is equivalent to calibration in one dimension ($d=1$), and gives a counterexample showing that this equivalence fails in higher dimensions. This motivates a stronger notion, strong indirect elicitation (strong IE), which is as easy to verify as IE and, under mild conditions, implies calibration; for surrogates with strongly convex components, strong IE is both necessary and sufficient for calibration. The results yield constructive link-function designs and enable efficient construction of 1D calibrated surrogates for orderable targets, including ordinal regression, extending the toolkit for designing consistent surrogates beyond the polyhedral case. Overall, the paper provides geometry-driven criteria (IE and strong IE) that simplify verifying calibration and guide surrogate design with consistency guarantees.
Abstract
The statistical consistency of surrogate losses for discrete prediction tasks is often checked via the condition of calibration. However, directly verifying calibration can be arduous. Recent work shows that for polyhedral surrogates, a less arduous condition, indirect elicitation (IE), is still equivalent to calibration. We give the first results of this type for non-polyhedral surrogates, specifically the class of convex differentiable losses. We first prove that under mild conditions, IE and calibration are equivalent for one-dimensional losses in this class. We construct a counter-example that shows that this equivalence fails in higher dimensions. This motivates the introduction of strong IE, a strengthened form of IE that is equally easy to verify. We establish that strong IE implies calibration for differentiable surrogates and is both necessary and sufficient for strongly convex, differentiable surrogates. Finally, we apply these results to a range of problems to demonstrate the power of IE and strong IE for designing and analyzing consistent differentiable surrogates.
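As a concrete illustration of the one-dimensional setting, the following minimal sketch numerically checks an IE-style condition for a familiar convex differentiable surrogate. It is not taken from the paper: the choice of logistic loss, binary 0-1 target, sign link, and all function names are assumptions made for illustration. For each label probability $p = P(Y=+1)$, it minimizes the expected surrogate loss over $u \in \mathbb{R}$ and verifies that the link maps the minimizer to a Bayes-optimal label.

```python
import math

def expected_logistic_loss(u, p):
    """Expected logistic loss of score u when P(Y = +1) = p (illustrative surrogate)."""
    return p * math.log1p(math.exp(-u)) + (1 - p) * math.log1p(math.exp(u))

def surrogate_minimizer(p, lo=-50.0, hi=50.0, iters=200):
    """Minimize the convex expected loss by bisection on its derivative.

    The derivative of the expected logistic loss in u is sigmoid(u) - p,
    which is increasing in u, so its root is the unique minimizer.
    """
    def grad(u):
        return 1.0 / (1.0 + math.exp(-u)) - p
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def sign_link(u):
    """Hypothetical link from surrogate scores to labels in {-1, +1}."""
    return 1 if u >= 0 else -1

def bayes_label(p):
    """Label minimizing expected 0-1 loss when P(Y = +1) = p."""
    return 1 if p >= 0.5 else -1

def link_matches_target(ps):
    """True if the linked surrogate minimizer is Bayes-optimal for every p in ps."""
    return all(sign_link(surrogate_minimizer(p)) == bayes_label(p) for p in ps)

if __name__ == "__main__":
    grid = [k / 100 for k in range(1, 100) if k != 50]  # skip the tie at p = 0.5
    print(link_matches_target(grid))
```

The check only inspects exact minimizers, which is the flavor of condition that IE captures; calibration additionally constrains sequences of approximate minimizers. According to the abstract, the two coincide for one-dimensional losses in this class under mild conditions, which is why a check of this kind is informative in 1D but would not suffice in higher dimensions.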
