Multi-class Support Vector Machine with Maximizing Minimum Margin

Zhezheng Hao, Feiping Nie, Rong Wang

TL;DR

This work tackles multi-class classification by reformulating SVM to maximize the minimum margin over all class pairs. It introduces the $M^3$SVM framework, which combines pairwise class losses with a tunable $p$-norm regularizer $\sum_{k<l} \|\mathbf{w}_k - \mathbf{w}_l\|_2^p$; as $p\to\infty$, this regularizer links to the margin-based structural risk minimization (SRM) objective of enlarging the smallest pairwise margin. The resulting optimization problem is strictly convex and smooth, obtained via a surrogate of the hinge loss, and is solved with Adam updates. For $p=2$ the regularizer connects to $\ell_2$ regularization, while other values of $p$ provide broader control over the margins. The authors extend the approach to the softmax loss (ISM3) for neural networks and validate it across diverse datasets, reporting consistent improvements over standard multi-class SVM methods along with favorable generalization and convergence behavior.
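
To make the regularizer concrete, here is a minimal Python sketch that evaluates $\sum_{k<l} \|\mathbf{w}_k - \mathbf{w}_l\|_2^p$ on toy weights. The function names, the one-row-per-class layout, and the numbers are illustrative assumptions, not taken from the paper. The sketch checks two standard facts: for $p=2$ and class weights that sum to zero, the pairwise sum equals the number of classes times $\sum_k \|\mathbf{w}_k\|_2^2$, and for large $p$ the $p$-th root of the sum approaches the largest pairwise distance (which corresponds to the smallest margin).

```python
import itertools
import math


def pairwise_regularizer(W, p):
    """Sum over class pairs k < l of ||w_k - w_l||_2 ** p.

    W is a list of per-class weight vectors (hypothetical layout:
    one row per class).
    """
    total = 0.0
    for w_k, w_l in itertools.combinations(W, 2):
        total += math.dist(w_k, w_l) ** p
    return total


def max_pairwise_distance(W):
    """Largest Euclidean distance between any two class weight vectors."""
    return max(math.dist(w_k, w_l) for w_k, w_l in itertools.combinations(W, 2))


# Toy weights for 3 classes in 2 dimensions (made-up numbers).
# The rows are chosen to sum to the zero vector.
W = [[1.0, 0.0], [-0.5, 2.0], [-0.5, -2.0]]

# p = 2: with centered weights, the pairwise sum equals
# (number of classes) * (sum of squared norms), i.e. scaled l2 regularization.
l2_term = len(W) * sum(x * x for w in W for x in w)
print("p=2 pairwise sum:", pairwise_regularizer(W, 2), "c * l2:", l2_term)

# Large p: the p-th root of the sum is dominated by the largest distance.
for p in (2, 10, 50):
    print(p, pairwise_regularizer(W, p) ** (1.0 / p), max_pairwise_distance(W))
```

Taking the $p$-th root in the last loop is only for readability; it does not change which weights minimize the sum.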

Abstract

Support Vector Machine (SVM) is a prominent machine learning technique widely applied in practical pattern recognition tasks. It achieves binary classification by maximizing the "margin", which represents the minimum distance between instances and the decision boundary. Although many efforts have been dedicated to extending SVM to the multi-class case through strategies such as one-versus-one and one-versus-rest, satisfactory solutions remain to be developed. In this paper, we propose a novel method for multi-class SVM that incorporates pairwise class losses and maximizes the minimum margin. Following this concept, we adopt a new formulation that gives multi-class SVM greater flexibility. Furthermore, we analyze the connections between the proposed method and multiple forms of multi-class SVM. The proposed regularizer, akin to the concept of "margin", can serve as a seamless enhancement to the softmax loss in deep learning, providing guidance for network parameter learning. Empirical evaluations demonstrate the effectiveness and superiority of our proposed method over existing multi-class classification methods.

Paper Structure (23 sections, 21 theorems, 65 equations, 10 figures, 4 tables)


Key Result

Lemma 1

Assume $g_{1}(\mathbf{z}),g_{2}(\mathbf{z}),\cdots,g_{m}(\mathbf{z})$ are given real functions. The following two optimization problems are equivalent as $p\rightarrow \infty$:
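
The two problems are not reproduced in this excerpt. As a hedged reading only, assuming the lemma is the standard $p$-norm limit and that the $g_i$ are non-negative (neither is confirmed by the text above), the pair would take the following form, and the sandwich bound underneath shows why the two minimizations coincide in the limit:

```latex
% Assumed form of the two problems (not shown in the excerpt):
\min_{\mathbf{z}} \; \max\{g_1(\mathbf{z}), \dots, g_m(\mathbf{z})\}
\qquad\text{and}\qquad
\min_{\mathbf{z}} \; \sum_{i=1}^{m} g_i(\mathbf{z})^{p}.

% Standard bound for non-negative g_i:
\max_{i} g_i(\mathbf{z})
\;\le\;
\Bigl(\sum_{i=1}^{m} g_i(\mathbf{z})^{p}\Bigr)^{1/p}
\;\le\;
m^{1/p} \max_{i} g_i(\mathbf{z}),
\qquad m^{1/p} \to 1 \text{ as } p \to \infty.
```

Because $t \mapsto t^{1/p}$ is increasing on non-negative values, minimizing the sum is the same as minimizing its $p$-th root, which the bound squeezes onto the maximum.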

Figures (10)

  • Figure 1: Illustrative figures.
  • Figure 2: Convergence of the objective function value.
  • Figure 3: The effect of parameter $p$ on experimental results.
  • Figure 4: Study of $\lambda$.
  • Figure 5: Accuracy curves with iterations on SVHN.
  • ...and 5 more figures

Theorems & Definitions (32)

  • Lemma 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 2
  • Theorem 4
  • Lemma 3
  • Theorem 5
  • Theorem 6
  • Lemma 1
  • ...and 22 more