Enhancing Robustness and Efficiency of Least Square Twin SVM via Granular Computing
M. Tanveer, R. K. Sharma, A. Quadir, M. Sajid
TL;DR
This work tackles robustness and efficiency gaps in least squares twin SVM (LSTSVM) by introducing granular-ball representations (GBLSTSVM) and a large-scale variant (LS-GBLSTSVM) that incorporate structural risk minimization through regularization. The methods extend to linear and nonlinear kernels, encoding granular-ball centers and radii into the optimization framework, and employ closed-form solutions or SMO to avoid costly matrix inversions. Across 34 UCI/KEEL datasets and large NDC datasets, the proposed approaches consistently surpass baseline models in accuracy and exhibit substantial speedups (up to 1000x) while maintaining robustness to label noise. The results demonstrate improved generalization, scalability to millions of samples, and a novel dimensionality-reduction perspective via granular-ball representations.
Abstract
In machine learning, the least square twin support vector machine (LSTSVM) is regarded as a state-of-the-art model. However, LSTSVM is sensitive to noise and outliers, overlooks the structural risk minimization (SRM) principle, and is unstable under resampling. Moreover, its computational complexity and reliance on matrix inversions hinder efficient processing of large datasets. To address these challenges, we propose the robust granular ball LSTSVM (GBLSTSVM), which is trained on granular balls instead of the original data points. Each granular ball is represented by its center, which summarizes the pertinent information of the data points lying within a ball of specified radius. To improve scalability and efficiency, we further introduce the large-scale GBLSTSVM (LS-GBLSTSVM), which incorporates the SRM principle through regularization terms. In experiments on UCI, KEEL, and NDC benchmark datasets, both GBLSTSVM and LS-GBLSTSVM consistently outperform the baseline models.
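As a rough illustration of the granular-ball idea described above, the sketch below collapses a small group of labelled points into a single ball with a center, a radius, and a class label. Taking the center as the mean of the points, the radius as the mean distance to the center, and the label as the majority class are assumptions made for this example, as are the function name `granular_ball` and the `purity` value; the abstract does not specify these details.

```python
import numpy as np

def granular_ball(points, labels):
    """Summarize labelled points as (center, radius, label, purity).

    Assumed conventions (not taken from the abstract):
    center = mean of the points, radius = mean distance to the center,
    label = majority class, purity = fraction of points with that label.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    center = points.mean(axis=0)
    radius = np.linalg.norm(points - center, axis=1).mean()
    classes, counts = np.unique(labels, return_counts=True)
    label = classes[np.argmax(counts)]
    purity = counts.max() / labels.size
    return center, radius, label, purity

# Four points on the corners of a square, one of them with a different label.
pts = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
lbl = [1, 1, 1, -1]
center, radius, label, purity = granular_ball(pts, lbl)
print(center, radius, label, purity)
```

A classifier trained on such balls sees one (center, radius, label) triple per ball instead of every original sample, which is where the reduction in training-set size comes from.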
