Multi-class Support Vector Machine with Maximizing Minimum Margin
Zhezheng Hao, Feiping Nie, Rong Wang
TL;DR
This work tackles multi-class classification by reformulating SVM to maximize the minimum margin across all class pairs. It introduces the $M^3$SVM framework, which combines pairwise class losses with the regularizer $\sum_{k<l} \|\mathbf{w}_k - \mathbf{w}_l\|_2^p$, whose exponent $p$ is tunable. As $p\to\infty$, this regularizer is linked to the margin-based structural risk minimization (SRM) objective of maximizing the minimum margin; when $p=2$ it connects to standard $\ell_2$ regularization, and other values of $p$ give broader control over the margins. Using a surrogate for the hinge loss, the method yields a strictly convex, smooth optimization problem that is solved with Adam updates. The authors extend the approach to the softmax loss (ISM3) for neural networks and validate it on diverse datasets, reporting consistent improvements over standard multi-class SVM methods together with favorable generalization and convergence behavior.
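As an illustration of the regularizer $\sum_{k<l} \|\mathbf{w}_k - \mathbf{w}_l\|_2^p$ mentioned above, the following NumPy sketch evaluates it for a weight matrix with one column per class. The function names, the column-per-class layout, and the random example matrix are assumptions made for illustration and are not taken from the paper. The sketch also shows why large $p$ relates to the minimum margin: the $p$-th root of the sum approaches the largest pairwise distance $\max_{k<l}\|\mathbf{w}_k - \mathbf{w}_l\|_2$, so penalizing it pushes down the largest distance (equivalently, under the usual margin $\propto 1/\|\mathbf{w}_k - \mathbf{w}_l\|_2$ reading, pushes up the smallest margin).

```python
import numpy as np


def pairwise_distances(W):
    """Return ||w_k - w_l||_2 for every class pair k < l.

    W is assumed to have shape (d, c): one weight column per class.
    """
    k, l = np.triu_indices(W.shape[1], k=1)  # index pairs with k < l
    return np.linalg.norm(W[:, k] - W[:, l], axis=0)


def pairwise_regularizer(W, p=2.0):
    """Sum over class pairs k < l of ||w_k - w_l||_2 ** p."""
    return float(np.sum(pairwise_distances(W) ** p))


def smooth_max_distance(W, p):
    """p-th root of the regularizer, computed in a numerically stable way.

    Distances are rescaled by their maximum before being raised to p,
    so large p does not overflow. Tends to the max distance as p grows.
    """
    d = pairwise_distances(W)
    d_max = d.max()
    if d_max == 0.0:
        return 0.0
    return float(d_max * np.sum((d / d_max) ** p) ** (1.0 / p))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(5, 4))  # hypothetical example: d = 5 features, c = 4 classes
    print("largest pairwise distance:", pairwise_distances(W).max())
    for p in (2.0, 8.0, 64.0, 512.0):
        print(f"p = {p:>5}: p-th root of regularizer = {smooth_max_distance(W, p)}")
```

For $p=2$ the regularizer satisfies the identity $\sum_{k<l}\|\mathbf{w}_k-\mathbf{w}_l\|_2^2 = c\sum_k\|\mathbf{w}_k\|_2^2 - \|\sum_k \mathbf{w}_k\|_2^2$ for $c$ classes, so it reduces to a scaled $\ell_2$ penalty whenever the class weight vectors sum to zero.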
Abstract
Support Vector Machine (SVM) is a prominent machine learning technique widely applied in practical pattern recognition tasks. It achieves binary classification by maximizing the "margin", which represents the minimum distance between instances and the decision boundary. Although many efforts have been made to extend SVM to the multi-class case through strategies such as one-versus-one and one-versus-rest, satisfactory solutions remain to be developed. In this paper, we propose a novel method for multi-class SVM that incorporates pairwise class losses and maximizes the minimum margin. Following this idea, we adopt a new formulation that gives multi-class SVM greater flexibility. Furthermore, we analyze the connections between the proposed method and several existing forms of multi-class SVM. The proposed regularizer, which plays a role akin to the "margin", can serve as a drop-in enhancement to the softmax loss in deep learning, providing guidance for network parameter learning. Empirical evaluations demonstrate the effectiveness of the proposed method and its superiority over existing multi-class classification methods.
