Smooth Ranking SVM via Cutting-Plane Method

Erhan Can Ozcan, Berk Görgülü, Mustafa G. Baydogan, Ioannis Ch. Paschalidis

TL;DR

This work targets imbalanced binary classification by directly maximizing AUC during training. It extends Ranking SVM with a prototype-learning, column-generation framework and introduces smoothing to restrict weight fluctuations across iterations, yielding more stable test performance. Empirical results across 73 datasets show competitive AUC, with the Smooth Ranking-CG Prototype achieving the best performance on 25 datasets and yielding sparser models. The approach offers a scalable, robust alternative to standard regularized Ranking SVM formulations for AUC-centric learning.

Abstract

The most popular classification algorithms are designed to maximize classification accuracy during training. However, this strategy may fail in the presence of class imbalance, since a model can achieve high accuracy simply by overfitting to the majority class. The Area Under the Curve (AUC), by contrast, is a widely used metric for comparing the classification performance of different algorithms under class imbalance, and various approaches that directly optimize this metric during training have been proposed. Among them, SVM-based formulations are especially popular because they allow different regularization strategies to be incorporated easily. In this work, we develop a prototype learning approach that relies on a cutting-plane method, similar to Ranking SVM, to maximize AUC. Our algorithm learns simpler models by iteratively introducing cutting planes, so overfitting is prevented in an unconventional way. Furthermore, it penalizes changes in the weights at each iteration to avoid the large jumps that might otherwise be observed in test performance, thus facilitating a smooth learning process. In experiments conducted on 73 binary classification datasets, our method yields the best test AUC on 25 datasets among its relevant competitors.
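
To make the link between AUC and ranking concrete, the sketch below computes AUC as the fraction of positive-negative pairs that a scorer orders correctly, alongside the pairwise hinge surrogate that ranking-style SVM formulations typically minimize in its place. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function names, the toy scores, and the margin of 1.0 are all hypothetical, and no prototype learning, cutting planes, or smoothing is included.

```python
def split_scores(scores, labels):
    """Separate scores of positive (label 1) and negative (label 0) samples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg


def pairwise_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the usual Mann-Whitney convention).
    """
    pos, neg = split_scores(scores, labels)
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Mean hinge loss over all (positive, negative) pairs.

    With margin 1.0 this upper-bounds the pairwise misranking rate,
    i.e. 1 - AUC, which is why it serves as a convex surrogate.
    The margin value is an illustrative assumption.
    """
    pos, neg = split_scores(scores, labels)
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Hypothetical toy data: two positives, three negatives.
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 0, 1, 0, 0]

auc = pairwise_auc(scores, labels)
hinge = pairwise_hinge_loss(scores, labels)
print(f"AUC = {auc:.4f}, pairwise hinge = {hinge:.4f}")
```

Because the number of pairs grows with the product of the two class sizes, enumerating every pair as done here becomes expensive on large datasets; generating constraints or columns only as needed is one way to avoid handling all of them at once.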

Paper Structure (8 sections, 12 equations, 1 figure, 9 tables, 1 algorithm)

Figures (1)

  • Figure 1: Changes in the Test AUC with the number of iterations (each newly added point to the set $\mathcal{Q}$) for the Ranking-CG Prototype, Unbounded Ranking-CG Prototype and Smooth Ranking-CG Prototype methods.