Analyzing Cost-Sensitive Surrogate Losses via $\mathcal{H}$-calibration
Sanket Shah, Milind Tambe, Jessie Finocchiaro
TL;DR
The paper asks whether training with cost-sensitive surrogates yields better task performance than training with cost-agnostic surrogates followed by post-processing. Using the framework of $\mathcal{H}$-calibration, it shows that cross-entropy can fail to be $\mathcal{H}$-consistent for cost-sensitive targets, while specially designed Embeddings surrogates achieve $\mathcal{H}$-consistency under $P$-minimizable conditions. The theoretical results are complemented by experiments on synthetic data and UCI cost-sensitive tasks, where Embeddings and other cost-sensitive surrogates consistently outperform cost-agnostic approaches with thresholding. The findings provide both a theoretical rationale and practical guidance for adopting cost-sensitive surrogate losses in small-model regimes, and they indicate promising directions for decision-focused learning and structured prediction. The work highlights the practical impact of surrogate design on reducing misclassification costs in real-world settings, while acknowledging distributional limitations and areas for future research.
Abstract
This paper aims to understand whether machine learning models should be trained using cost-sensitive surrogates or cost-agnostic ones (e.g., cross-entropy). Analyzing this question through the lens of $\mathcal{H}$-calibration, we find that cost-sensitive surrogates can strictly outperform their cost-agnostic counterparts when learning small models under common distributional assumptions. Since these distributional assumptions are hard to verify in practice, we also show that cost-sensitive surrogates consistently outperform cost-agnostic surrogates on classification datasets from the UCI repository. Together, these results make a strong case for using cost-sensitive surrogates in practice.
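To make the cost-agnostic baseline concrete, the following is a minimal sketch of the post-processing step that is typically applied after training with a loss such as cross-entropy: given estimated class probabilities, pick the prediction with the lowest expected cost. It illustrates the general idea only and is not taken from the paper. The function name `min_cost_decision`, the cost matrix, and the probability values are hypothetical, and the convention `cost[y][a]` (cost of predicting `a` when the true label is `y`) is an assumption.

```python
def min_cost_decision(probs, cost):
    """Return the prediction that minimizes expected cost.

    probs: list of class probabilities for one example (assumed to sum to 1).
    cost:  cost[y][a] is the (assumed) cost of predicting a when the truth is y.
    """
    n_actions = len(cost[0])
    # Expected cost of each candidate prediction under the estimated probabilities.
    expected = [
        sum(probs[y] * cost[y][a] for y in range(len(probs)))
        for a in range(n_actions)
    ]
    return min(range(n_actions), key=lambda a: expected[a])


# Hypothetical binary task: a missed positive costs 5, a false alarm costs 1.
cost = [[0.0, 1.0],
        [5.0, 0.0]]

# Hypothetical probability estimates from a cost-agnostic model.
batch = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6]]

cost_aware = [min_cost_decision(p, cost) for p in batch]
plain_argmax = [max(range(len(p)), key=lambda c: p[c]) for p in batch]
```

In this binary case the rule reduces to thresholding the positive-class probability at 1/(1 + 5) instead of 1/2, which is why the second example is labeled positive under the cost-aware rule but negative under plain argmax. A cost-sensitive surrogate instead builds the costs into the training loss, so no such post-processing step is applied.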
