A Novel Loss Function-based Support Vector Machine for Binary Classification

Yan Li; Liping Zhang

TL;DR

The paper introduces the Slide loss $\ell_s$ to address insufficient penalization in classic SVM losses, enabling graded penalties for samples inside the margin and near the decision boundary. It develops the $\ell_s$-SVM, derives the subdifferential and proximal operator of $\ell_s$, and defines proximal stationary points to establish first-order optimality conditions. An efficient $\ell_s$-ADMM algorithm with a behavior-based working set is proposed and analyzed, including convergence to proximal stationary points and local minima. Extensive experiments on seven real-world datasets demonstrate improved robustness and generalization over multiple SVM solvers, especially under label noise.

Abstract

Previous support vector machines (SVMs), including the $0/1$ loss SVM, hinge loss SVM, ramp loss SVM, truncated pinball loss SVM, and others, overlook the degree of penalty for correctly classified samples within the margin. This oversight affects the generalization ability of the SVM classifier to some extent. To address this limitation, from the perspective of confidence margin, we propose a novel Slide loss function ($\ell_s$) to construct the support vector machine classifier ($\ell_s$-SVM). By introducing the concept of a proximal stationary point and using the property of Lipschitz continuity, we derive the first-order optimality conditions for $\ell_s$-SVM. Based on this, we define the $\ell_s$ support vectors and the working set of $\ell_s$-SVM. To solve $\ell_s$-SVM efficiently, we devise a fast alternating direction method of multipliers with the working set ($\ell_s$-ADMM) and provide its convergence analysis. Numerical experiments on real-world datasets confirm the robustness and effectiveness of the proposed method.
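
To make the comparison in the abstract concrete, the following is a minimal sketch of three of the classic losses it names, written as functions of $t = 1 - y f(x)$. It covers only the baselines and does not implement the Slide loss $\ell_s$. The function names, the sample margin values, and the choice to truncate the ramp loss at 1 are assumptions made for illustration (other truncation levels appear in the literature).

```python
import numpy as np

# All losses take t = 1 - y * f(x); t <= 0 means the sample lies on or
# outside the margin on the correct side.

def zero_one_loss(t):
    """0/1 soft-margin loss: constant penalty 1 whenever t > 0."""
    return (np.asarray(t, dtype=float) > 0).astype(float)

def hinge_loss(t):
    """Hinge loss: max(0, t), grows without bound as t increases."""
    return np.maximum(0.0, np.asarray(t, dtype=float))

def ramp_loss(t):
    """Ramp loss, assumed here to be the hinge loss truncated at 1."""
    return np.clip(np.asarray(t, dtype=float), 0.0, 1.0)

# Hypothetical values of y * f(x), from confidently correct to badly wrong.
margins = np.array([1.5, 1.0, 0.9, 0.1, -0.1, -2.0])
t = 1.0 - margins

for name, loss in [("0/1", zero_one_loss),
                   ("hinge", hinge_loss),
                   ("ramp", ramp_loss)]:
    print(f"{name:>5}: {np.round(loss(t), 3)}")
```

The samples with $0 < y f(x) < 1$ (here 0.9 and 0.1) are correctly classified but fall inside the margin. The 0/1 loss charges them the same flat penalty as a misclassified sample, while the hinge and ramp losses charge them $1 - y f(x)$; the hinge loss alone keeps growing for the badly misclassified sample.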

Paper Structure (9 sections, 6 theorems, 64 equations, 6 tables, 1 algorithm)

Introduction
Theoretical analysis for $\ell_s$ loss function
Optimality conditions for $\ell_s$-SVM
Fast Algorithm
$\ell_s$ Support Vectors
$\ell_s$-ADMM Framework
Convergence Analysis
Numerical Experiments
Conclusion

Key Result

Proposition 2

Given $\epsilon$ and $v$, the subdifferential of the Slide loss function $\ell_s$ at $t\in\mathbb{R}$ is:

Theorems & Definitions (14)

Definition 1
Proposition 2
proof
Proposition 3
proof
Definition 4
Theorem 5
proof
Theorem 6
proof
...and 4 more
