Gap Safe Screening Rules for Fast Training of Robust Support Vector Machines under Feature Noise

Tan-Hau Nguyen, Thu-Le Tran, Kien Trung Nguyen

Abstract

Robust Support Vector Machines (R-SVMs) address feature noise by adopting a worst-case robust formulation that explicitly incorporates uncertainty sets into training. While this robustness improves reliability, it also leads to increased computational cost. In this work, we develop safe sample screening rules for R-SVMs that reduce the training complexity without affecting the optimal solution. To the best of our knowledge, this is the first study to apply safe screening techniques to worst-case robust models in supervised machine learning. Our approach safely identifies training samples whose uncertainty sets are guaranteed to lie entirely on either side of the margin hyperplane, thereby reducing the problem size and accelerating optimization. Owing to the nonstandard structure of R-SVMs, the proposed screening rules are derived from the Lagrangian duality rather than the Fenchel-Rockafellar duality commonly used in recent methods. Based on this analysis, we first establish an ideal screening rule, and then derive a practical rule by adapting GAP-based safe regions to the robust setting. Experiments demonstrate that the proposed method significantly reduces training time while preserving classification accuracy.
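The abstract adapts GAP-based safe regions to the robust setting. As background, the following is a minimal sketch of the *standard* (non-robust) Gap Safe sample-screening rule for a linear soft-margin SVM, which the paper's rules generalize. All names and the data here are illustrative, not taken from the paper: the primal is $P(w) = \tfrac{1}{2}\|w\|^2 + C\sum_i \max(0, 1 - y_i \langle w, x_i\rangle)$, its dual is solved by coordinate ascent, and since $P$ is 1-strongly convex the duality gap $G$ yields the safe ball $\|w^* - w\| \le \sqrt{2G}$, from which samples are fixed at $\alpha_i^* = 0$ or $\alpha_i^* = C$.

```python
import numpy as np

def gap_safe_screen_svm(X, y, C=1.0, n_epochs=20, seed=0):
    """Dual coordinate ascent for a linear soft-margin SVM with
    Gap Safe sample screening (illustrative sketch, non-robust case).

    Primal:  P(w) = 0.5 ||w||^2 + C * sum_i max(0, 1 - y_i <w, x_i>)
    Dual:    D(a) = sum_i a_i - 0.5 ||sum_i a_i y_i x_i||^2,  0 <= a_i <= C
    with the primal-dual link  w(a) = sum_i a_i y_i x_i.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    sq_norms = (X ** 2).sum(axis=1)
    active = np.ones(n, dtype=bool)  # samples not yet screened out

    for _ in range(n_epochs):
        # One pass of dual coordinate ascent over the active set.
        for i in rng.permutation(np.flatnonzero(active)):
            if sq_norms[i] == 0.0:
                continue
            grad = 1.0 - y[i] * (X[i] @ w)
            new_a = np.clip(alpha[i] + grad / sq_norms[i], 0.0, C)
            w += (new_a - alpha[i]) * y[i] * X[i]
            alpha[i] = new_a

        # Duality gap at the current primal-dual pair (w, alpha).
        margins = y * (X @ w)
        primal = 0.5 * (w @ w) + C * np.maximum(0.0, 1.0 - margins).sum()
        dual = alpha.sum() - 0.5 * (w @ w)
        gap = primal - dual

        # P is 1-strongly convex in w, hence ||w* - w|| <= sqrt(2 * gap).
        r = np.sqrt(max(2.0 * gap, 0.0))
        norms = np.sqrt(sq_norms)
        # y_i <w*, x_i> > 1 for every w* in the safe ball => alpha_i* = 0
        at_zero = active & (margins - r * norms > 1.0)
        # y_i <w*, x_i> < 1 for every w* in the safe ball => alpha_i* = C
        at_C = active & (margins + r * norms < 1.0)
        for i in np.flatnonzero(at_zero):
            w -= alpha[i] * y[i] * X[i]
            alpha[i] = 0.0
        for i in np.flatnonzero(at_C):
            w += (C - alpha[i]) * y[i] * X[i]
            alpha[i] = C
        active &= ~(at_zero | at_C)

    return w, alpha, gap, active

# Synthetic usage on near-separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ rng.normal(size=5) + 0.1 * rng.normal(size=200))
w, alpha, gap, active = gap_safe_screen_svm(X, y, C=1.0)
n_screened = int((~active).sum())
```

Screened samples are removed from subsequent epochs, which is the source of the speedups the abstract reports; the paper's contribution is deriving the analogous ball and screening tests for the robust loss via Lagrangian duality, where the rule must certify that an entire uncertainty set lies on one side of the margin.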

Paper Structure

This paper contains 15 sections, 9 theorems, 47 equations, 5 figures, 2 tables, and 1 algorithm.

Key Result

Lemma 1

Consider a training example $i$. The robust loss admits the following closed-form expression:
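The expression itself is not reproduced in this overview. For reference, a standard closed form of this kind, under the assumption of an $\ell_2$-ball uncertainty set of radius $\rho$ around each $x_i$ (the symbols $w$, $x_i$, $y_i$, $\rho$ are assumed here, not taken from the paper's statement), is

$$\max_{\|\delta_i\|_2 \le \rho} \max\bigl(0,\; 1 - y_i \langle w, x_i + \delta_i \rangle\bigr) \;=\; \max\bigl(0,\; 1 - y_i \langle w, x_i \rangle + \rho \|w\|_2\bigr),$$

since the inner maximization of $-y_i\langle w, \delta_i\rangle$ over the ball attains $\rho\|w\|_2$ and $\max(0,\cdot)$ is nondecreasing. The paper's Lemma 1 may state a different or more general form.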

Figures (5)

  • Figure 1: Separating hyperplane of R-SVM on linearly separable data.
  • Figure 2: Training time comparison with and without safe screening for different values of $C$ and $\rho$ on the Breast Cancer Wisconsin dataset.
  • Figure 3: Iteration-wise dynamics of the safe screening elimination ratio for different values of the uncertainty radius $\rho$ on the Breast Cancer Wisconsin dataset.
  • Figure 4: Training time comparison with and without safe screening for different values of $C$ and $\rho$ on the Spambase Email dataset.
  • Figure 5: Iteration-wise dynamics of the safe screening elimination ratio for different values of the uncertainty radius $\rho$ on the Spambase Email dataset.

Theorems & Definitions (19)

  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Remark 1
  • Theorem 2
  • proof
  • Lemma 2
  • proof
  • Theorem 3: Ideal safe sample screening
  • ...and 9 more