Unified Binary and Multiclass Margin-Based Classification
Yutong Wang, Clayton Scott
TL;DR
This work develops a unifying framework for binary and multiclass margin-based classification by expressing a wide class of multiclass losses as permutation-equivariant relative-margin (PERM) losses with a symmetric template $\psi$. Central to the framework is the matrix label code, which links relative margins to loss values, showing that PERM losses are exactly those expressible via $\mathcal{L}_y(\mathbf{v})=\psi(\bm{\Upsilon}_y\mathbf{D}\mathbf{v})$. The authors extend binary margin-calibration results to the multiclass setting under the notion of total regularity, proving that totally regular PERM losses are classification-calibrated (CC) and that sums of such losses preserve CC; they also demonstrate CC for Fenchel-Young losses when the negentropy is totally regular, even without strong convexity. The paper further develops a detailed mathematical apparatus, including the uniqueness of the matrix label code and a geometrical view of the loss surface through the $F$ and $G$ mappings, to support these results and provide broader tools for multiclass loss design. These contributions offer a principled route to understanding and constructing CC multiclass losses, with practical implications for designing surrogate losses whose performance guarantees transfer to the 0-1 loss.
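To make the relative margin form $\mathcal{L}_y(\mathbf{v})=\psi(\bm{\Upsilon}_y\mathbf{D}\mathbf{v})$ concrete, the following is a minimal numerical sketch, not the authors' code. This summary does not spell out the conventions for $\mathbf{D}$ and $\bm{\Upsilon}_y$, so the sketch assumes $\mathbf{D}=[-\mathbf{I}_{k-1}\ \mathbf{1}]$ and takes $\bm{\Upsilon}_y$ to be the identity for $y=k$ and, for $y<k$, the identity with its $y$-th column replaced by $-1$. Under these assumptions $\bm{\Upsilon}_y\mathbf{D}\mathbf{v}$ collects the relative margins $v_y-v_j$ for $j\neq y$. The function names are hypothetical, and the template chosen for illustration is that of the multinomial logistic (cross-entropy) loss.

```python
import numpy as np

def difference_matrix(k):
    # Assumed convention: D = [-I_{k-1} | 1], so (D v)_j = v_k - v_j.
    return np.hstack([-np.eye(k - 1), np.ones((k - 1, 1))])

def label_code(y, k):
    # Assumed convention, classes labeled 1..k: identity for y == k;
    # otherwise the identity with column y replaced by all -1.
    U = np.eye(k - 1)
    if y < k:
        U[:, y - 1] = -1.0
    return U

def psi_cross_entropy(z):
    # Symmetric template: psi(z) = log(1 + sum_j exp(-z_j)).
    return np.log1p(np.sum(np.exp(-z)))

def perm_loss(y, v, psi=psi_cross_entropy):
    # Relative margin form: L_y(v) = psi(Upsilon_y D v).
    k = len(v)
    return psi(label_code(y, k) @ difference_matrix(k) @ v)

def softmax_cross_entropy(y, v):
    # Reference implementation: -log softmax_y(v).
    return np.log(np.sum(np.exp(v))) - v[y - 1]

v = np.array([1.0, 2.0, 0.5])
for y in (1, 2, 3):
    print(y, perm_loss(y, v), softmax_cross_entropy(y, v))
```

For each label the two printed values should agree up to floating-point error, illustrating that the cross-entropy loss fits the relative margin form with this template. Swapping in a different symmetric `psi` gives other losses of the same form.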
Abstract
The notion of margin loss has been central to the development and analysis of algorithms for binary classification. To date, however, there remains no consensus as to the analogue of the margin loss for multiclass classification. In this work, we show that a broad range of multiclass loss functions, including many popular ones, can be expressed in the relative margin form, a generalization of the margin form of binary losses. The relative margin form is broadly useful for understanding and analyzing multiclass losses as shown by our prior work (Wang and Scott, 2020, 2021). To further demonstrate the utility of this way of expressing multiclass losses, we use it to extend the seminal result of Bartlett et al. (2006) on classification-calibration of binary margin losses to multiclass. We then analyze the class of Fenchel-Young losses, and expand the set of these losses that are known to be classification-calibrated.
