L0-Regularized Quadratic Surface Support Vector Machines

Ahmad Mousavi; Ramin Zandvakili; Zheming Gao

TL;DR

This work develops a penalty decomposition algorithm capable of producing solutions that provably satisfy the first-order Lu-Zhang optimality conditions and shows that the subproblems arising within the algorithm either admit closed-form solutions or can be solved efficiently through dual formulations, which contributes to the method's overall effectiveness.

Abstract

Kernel-free quadratic surface support vector machines (QSVM) have recently gained traction due to their flexibility in modeling nonlinear decision boundaries without relying on kernel functions. However, a full quadratic classifier significantly increases the number of model parameters, which scales quadratically with the data dimensionality; this often leads to overfitting and makes interpretation difficult. To address these challenges, we propose sparse variants of the QSVM by enforcing a cardinality constraint on the model parameters. While it enhances generalization and promotes sparsity, the $\ell_0$-norm inevitably incurs additional computational complexity. To tackle this, we develop a penalty decomposition algorithm capable of producing solutions that provably satisfy the first-order Lu-Zhang optimality conditions. We show that the subproblems arising within the algorithm either admit closed-form solutions or can be solved efficiently through dual formulations, which contributes to the method's overall effectiveness. We also analyze the convergence behavior of the algorithm under both the hinge and least-squares loss settings. Numerical experiments on public benchmark datasets indicate that the proposed model is competitive with commonly used SVM variants and produces sparse solutions as expected. Moreover, its strong performance on real-world credit datasets demonstrates its potential for credit scoring applications.
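To make the abstract's two central ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual kernel-free quadratic surface form $f(x) = \frac{1}{2} x^\top W x + b^\top x + c$ with symmetric $W$; the offset name `c` and all function names are hypothetical. The top-$k$ projection is a standard closed-form operation for cardinality constraints and is shown only as an illustration of why such constraints can yield closed-form steps; it is not claimed to be the exact subproblem solved in the paper.

```python
import numpy as np


def num_parameters(n: int) -> int:
    """Free parameters of a quadratic surface in n features:
    n(n+1)/2 for a symmetric W, n for b, and 1 for the offset c."""
    return n * (n + 1) // 2 + n + 1


def decision_value(W: np.ndarray, b: np.ndarray, c: float, x: np.ndarray) -> float:
    """Evaluate f(x) = 0.5 * x^T W x + b^T x + c (assumed model form).
    The predicted class is the sign of this value."""
    return float(0.5 * x @ W @ x + b @ x + c)


def project_top_k(v: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries of v and zero the rest,
    i.e. a Euclidean projection onto {z : ||z||_0 <= k}."""
    z = np.zeros_like(v)
    if k > 0:
        keep = np.argsort(np.abs(v))[-k:]  # indices of the k largest |v_i|
        z[keep] = v[keep]
    return z
```

For example, `num_parameters(10)` gives 66 parameters for only 10 features, which illustrates the quadratic growth the abstract refers to, and `project_top_k` returns a vector with at most `k` non-zero entries.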

Paper Structure (12 sections, 2 theorems, 35 equations, 4 figures, 4 tables, 1 algorithm)

Introduction
Preliminaries and Related Work
$\ell_0$-Regularized Quadratic Surface SVMs and a Penalty Decomposition Algorithm
$\ell_0$-Regularized Quadratic Surface SVM Model
Least-squares $\ell_0$-Regularized Quadratic Surface SVM Model
Convergence of $\ell_0$-Regularized QSVM Penalty Decomposition Algorithm
Numerical Experiments
Benchmark Datasets
Application: Credit Scoring
Conclusion
Tuning Parameters
Data source

Key Result

Corollary 3.1

For the hinge loss function, step 6 of Algorithm (algo: L0-penalty) can be obtained by solving (dual-Pz-hinge) and then applying (sol-Pz-hinge).

Figures (4)

Figure 1: The sparsity of the optimal coefficients $W^*$ and $b^*$ produced by three sparse quadratic models on the Immunotherapy dataset.
Figure 2: The trend of accuracy against parameters $C$ and $k$ for the proposed $\ell_0$-regularized QSVM models on the Immunotherapy and Ecoli datasets.
Figure 3: The trend of classification accuracy of LS-L0-QSVM as $k$ increases. Parameter $\mathcal{C}$ is fixed at $1$, $10$, or $100$.
Figure 4: $W^*$ and $b^*$ are sparse optimal solutions provided by LS-L0-QSVM, with non-zero entries colored in blue. Coefficients from LR that are statistically significant at the 95% confidence level are highlighted in green.

Theorems & Definitions (3)

Definition 2.1
Corollary 3.1
Corollary 3.2
