MaxMin-L2-SVC-NCH: A Novel Approach for Support Vector Classifier Training and Parameter Selection

Linkai Luo, Qiaoling Yang, Hong Peng, Yiding Wang, Ziyang Chen

TL;DR

The paper tackles the time cost of kernel-parameter tuning in SVC by proposing a minimax framework, MaxMin-L2-SVC-NCH, that jointly trains an SVC (via L2-SVC-NCH) and selects the Gaussian kernel parameter $\gamma$ without cross-validation. It develops a projected gradient algorithm (PGA) for the minimization and a gradient ascent algorithm with dynamic learning rate (GA-DLR) for the maximization, and combines them into a gradient-based (GB) solver. Empirical results on ten public two-class datasets show that MaxMin-L2-SVC-NCH sharply reduces the number of trained models (8.2 on average) while maintaining competitive test accuracy, and that it compares favorably with several baselines in training efficiency and stability. SMO is also shown to be a special case of PGA, which makes PGA the more flexible of the two. The approach offers a scalable, wrapper-like mechanism for SVC hyperparameter tuning that avoids exhaustive CV and can inform kernel-parameter selection in practice, with potential extensions to multiclass and regression tasks.

Abstract

The selection of Gaussian kernel parameters plays an important role in applications of support vector classification (SVC). A commonly used method is k-fold cross-validation with grid search (CV), which is extremely time-consuming because it needs to train a large number of SVC models. In this paper, a new approach is proposed to train SVC and optimize the selection of Gaussian kernel parameters. We first formulate the training and the parameter selection of SVC as a minimax optimization problem named MaxMin-L2-SVC-NCH, in which the minimization problem finds the closest points between two normal convex hulls (L2-SVC-NCH) while the maximization problem finds the optimal Gaussian kernel parameters. A lower time complexity can be expected in MaxMin-L2-SVC-NCH because CV is not needed. We then propose a projected gradient algorithm (PGA) for the training of L2-SVC-NCH. It is shown that the well-known sequential minimal optimization (SMO) algorithm is a special case of the PGA; thus, the PGA provides more flexibility than SMO. Furthermore, the maximization problem is solved by a gradient ascent algorithm with dynamic learning rate. Comparative experiments between MaxMin-L2-SVC-NCH and the previous best approaches on public datasets show that MaxMin-L2-SVC-NCH greatly reduces the number of models to be trained while maintaining competitive test accuracy. These findings indicate that MaxMin-L2-SVC-NCH is a better choice for SVC tasks.
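To make the minimax structure described in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It rests on several assumptions that the text here does not confirm: the inner objective is taken to be $\alpha^\top Q \alpha$ with $Q_{ij} = y_i y_j (e^{-\gamma \|x_i - x_j\|^2} + \text{ridge} \cdot \delta_{ij})$ and one simplex constraint per class; the `ridge` term is a stand-in for the L2 penalty; the outer ascent is run on $\log \gamma$; and the "dynamic learning rate" is approximated by halving the rate whenever the gradient changes sign. All function and parameter names are hypothetical.

```python
import math

def project_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    css, tau = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        css += uk
        t = (css - 1.0) / k
        if uk - t > 0:          # condition holds on a prefix; keep the last hit
            tau = t
    return [max(x - tau, 0.0) for x in v]

def signed_gram(D2, y, gamma, ridge):
    """Assumed form: Q_ij = y_i y_j (exp(-gamma * d_ij^2) + ridge * [i == j])."""
    n = len(y)
    return [[y[i] * y[j] * (math.exp(-gamma * D2[i][j]) + (ridge if i == j else 0.0))
             for j in range(n)] for i in range(n)]

def solve_inner(Q, y, alpha, iters=500):
    """Projected gradient descent on alpha^T Q alpha, one simplex per class."""
    n = len(y)
    blocks = ([i for i in range(n) if y[i] > 0], [i for i in range(n) if y[i] < 0])
    # 2 * (max absolute row sum) upper-bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * max(sum(abs(q) for q in row) for row in Q))
    alpha = list(alpha)
    for _ in range(iters):
        grad = [2.0 * sum(Q[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        for block in blocks:
            proj = project_simplex([alpha[i] - step * grad[i] for i in block])
            for i, p in zip(block, proj):
                alpha[i] = p
    return alpha

def fit(X, y, gamma=1.0, ridge=0.1, lr=1.0, outer_iters=30, tol=1e-3):
    """Alternate inner minimization over alpha with ascent steps on log(gamma)."""
    n = len(y)
    D2 = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    n_pos = sum(1 for t in y if t > 0)
    alpha = [1.0 / n_pos if t > 0 else 1.0 / (n - n_pos) for t in y]
    prev_g, n_models = 0.0, 0
    for _ in range(outer_iters):
        alpha = solve_inner(signed_gram(D2, y, gamma, ridge), y, alpha)
        n_models += 1           # one inner solve corresponds to one trained model
        # Derivative w.r.t. log(gamma) at the inner minimizer (envelope argument).
        g = gamma * sum(-D2[i][j] * math.exp(-gamma * D2[i][j])
                        * y[i] * y[j] * alpha[i] * alpha[j]
                        for i in range(n) for j in range(n))
        if abs(g) < tol:
            break
        if g * prev_g < 0:      # sign flip suggests overshoot: shrink the rate
            lr *= 0.5
        gamma *= math.exp(lr * g)
        prev_g = g
    return gamma, alpha, n_models

# Toy two-class data (illustrative only, not from the paper).
X = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y = [1, 1, 1, -1, -1, -1]
gamma, alpha, n_models = fit(X, y)
print(f"gamma={gamma:.3f}, models trained={n_models}")
```

The point of the sketch is the cost structure: each outer step trains a single model at the current $\gamma$, whereas k-fold CV with grid search trains k models per grid point. Working in $\log \gamma$ keeps $\gamma$ positive without an explicit constraint, which is a convenience of this sketch rather than something stated in the paper.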

Paper Structure

This paper contains 20 sections, 53 equations, 2 figures, 6 tables, 3 algorithms.

Figures (2)

  • Figure 1: Changes of $f'(\gamma)$ on the representative datasets. The vertical axis uses a logarithmic scale.
  • Figure 2: Ratios of inter-class distance to intra-class distance on the representative datasets.