Enhanced Feature Based Granular Ball Twin Support Vector Machine

A. Quadir, M. Sajid, M. Tanveer, P. N. Suganthan

TL;DR

The paper addresses classification of noisy, large-scale data by introducing EF-GBTSVM, which takes granular balls (GBs) as inputs, maps their centers into a randomized random vector functional link (RVFL)-like feature space, and then applies a twin support vector machine (TSVM) in that enhanced space. This two-stage approach (granular-ball representation followed by feature augmentation) improves robustness to noise and outliers while enabling scalable training on large datasets. Empirical results across UCI/KEEL, NDC, and medical datasets (ADNI, Schizophrenia) show EF-GBTSVM generalizing better than the baselines, supported by Friedman and Nemenyi statistical tests and ablation studies. The work provides a practical, efficient framework for large-scale, noisy classification, with potential extensions to multiclass tasks and to robust loss and regularization strategies.

Abstract

In this paper, we propose the enhanced feature based granular ball twin support vector machine (EF-GBTSVM). EF-GBTSVM takes coarse-grained granular balls (GBs) as input rather than individual data samples. The GBs are mapped to the feature space of the hidden layer using a random projection followed by a non-linear activation function. Concatenating the original and hidden features derived from the centers of the GBs gives rise to an enhanced feature space, commonly referred to as the random vector functional link (RVFL) space. This space captures nuanced feature information about the GBs. Further, we employ a twin support vector machine (TSVM) in the RVFL space for classification. TSVM generates two non-parallel hyperplanes in the enhanced feature space, which improves the generalization performance of the proposed EF-GBTSVM model. Moreover, the coarser granularity of the GBs makes the proposed EF-GBTSVM model robust to resampling and less susceptible to noise and outliers. We thoroughly evaluate the proposed EF-GBTSVM model on benchmark UCI and KEEL datasets, both with and without label noise. Experiments on NDC datasets further demonstrate the proposed model's ability to handle large datasets. Experimental results, supported by thorough statistical analyses, demonstrate that the proposed EF-GBTSVM model significantly outperforms the baseline models in terms of generalization, scalability, and robustness. The source code for the proposed EF-GBTSVM model, along with additional results and further details, is available at https://github.com/mtanveer1/EF-GBTSVM.
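
The enhanced-feature construction described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the function name `enhanced_features`, the uniform weight initialization in [-1, 1], the sigmoid activation, and the toy centers are all assumptions made for illustration.

```python
import math
import random


def enhanced_features(centers, n_hidden, seed=0):
    """Return [C, Z] per center, where Z = sigmoid(C W + b) (assumed activation)."""
    rng = random.Random(seed)
    n_features = len(centers[0])
    # Hidden-layer weights and biases are drawn at random and never trained.
    # The uniform [-1, 1] range is an assumption, not taken from the paper.
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
         for _ in range(n_features)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]

    enhanced = []
    for c in centers:
        hidden = []
        for j in range(n_hidden):
            pre = sum(c[i] * W[i][j] for i in range(n_features)) + b[j]
            hidden.append(1.0 / (1.0 + math.exp(-pre)))  # sigmoid
        # Concatenate the original center with its hidden features.
        enhanced.append(list(c) + hidden)
    return enhanced


# Toy granular-ball centers (hypothetical values): 3 balls in 2 dimensions.
centers = [[0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]
E = enhanced_features(centers, n_hidden=4)
# Each row of E has 2 original + 4 hidden = 6 entries.
```

A classifier such as TSVM would then be trained on the rows of `E` instead of on the raw samples, which is why the number of training points scales with the number of granular balls rather than with the dataset size.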

Paper Structure

This paper contains 20 sections, 17 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Flowchart of the proposed EF-GBTSVM model. The entire dataset can initially be considered as a single granular ball (GB). First, calculate the center "$C$" and the label of each GB. Then, compute the hidden layer matrix "$Z$" for the generated GB centers, with weights and biases randomly initialized. Next, obtain the enhanced features (RVFL features) by concatenating the hidden feature matrix "$Z$" with the center matrix "$C$". Finally, use TSVM to classify the data points into the $+1$ and $-1$ classes.
  • Figure 2: Effect of tuning the hyperparameters $(d_1, d_2)$ on the accuracy (ACC) of EF-GBTSVM on selected UCI and KEEL datasets.
  • Figure 3: Effect of the activation function parameter ("Act fun") on the performance of the proposed EF-GBTSVM model.
  • Figure 4: Effect of parameter $h$ on the performance of the proposed EF-GBTSVM model.
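
As a rough illustration of the first step in the Figure 1 caption, the sketch below computes the center and label of a single granular ball. The center is taken as the mean of the member points and the label by majority vote; this is the usual granular-ball convention, but it is assumed here rather than taken from the paper. The function name `granular_ball_summary` and the purity value it returns are hypothetical helpers for illustration.

```python
from collections import Counter


def granular_ball_summary(points, labels):
    """Summarize one granular ball by its center, majority label, and purity."""
    n = len(points)
    dim = len(points[0])
    # Center: coordinate-wise mean of the points in the ball (assumed convention).
    center = [sum(p[i] for p in points) / n for i in range(dim)]
    # Label: the most frequent class among the member points.
    label, count = Counter(labels).most_common(1)[0]
    # Purity: fraction of members that carry the majority label.
    purity = count / n
    return center, label, purity


# Toy ball (hypothetical data): two points of class +1 and one of class -1.
points = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
labels = [1, 1, -1]
center, label, purity = granular_ball_summary(points, labels)
# center == [1.0, 1.0], label == 1, purity == 2/3
```

Because a mislabeled minority point changes the center only slightly and does not change the majority label, a summary of this kind is one way to see why ball-level inputs are less sensitive to label noise than individual samples.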