Data structure > labels? Unsupervised heuristics for SVM hyperparameter estimation
Michał Cholewa, Michał Romaszewski, Przemysław Głomb
TL;DR
This work tackles the computational burden of tuning SVM hyperparameters by evaluating unsupervised heuristics that estimate $C$ and $\gamma$ from data without labels. It surveys multiple existing heuristics (Smola, Chapelle & Zien, Jaakkola, Soares, Gelbart, covtrace) and introduces a new extension (MC) for the $C$ parameter, combining them with Gaussian RBF kernels. Through experiments on 31 KEEL datasets and Bayesian analysis with a 1% ROPE (region of practical equivalence), the authors show that covtrace+MC often matches GSCV in accuracy while delivering 100–200× faster computation, with several heuristics achieving practical equivalence on many datasets. The findings suggest that unsupervised SVM calibration is viable for rapid deployment on resource-constrained platforms and in limited-label settings, though performance can degrade when clustering assumptions do not hold.
Abstract
Classification is one of the main areas of pattern recognition research, and within it, the Support Vector Machine (SVM) is one of the most popular methods outside the field of deep learning, and a de facto reference for many Machine Learning approaches. Its performance is determined by parameter selection, which is usually achieved by a time-consuming grid search cross-validation procedure (GSCV). That method, however, relies on the availability and quality of labelled examples and can therefore be hindered when those are limited. To address that problem, there exist several unsupervised heuristics that select parameters from the characteristics of the dataset instead of class label information. While an order of magnitude faster, they are scarcely used, under the assumption that their results are significantly worse than those of grid search. To challenge that assumption, we have proposed improved heuristics for SVM parameter selection and tested them against GSCV and state-of-the-art heuristics on over 30 standard classification datasets. The results show not only their advantage over state-of-the-art heuristics but also that they are statistically no worse than GSCV.
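To make the idea of label-free parameter selection concrete, the sketch below estimates the RBF kernel width $\gamma$ from unlabelled points alone. It uses the generic "median heuristic" (one common variant sets $\gamma = 1/(2\,\mathrm{median}\,\|x_i - x_j\|^2)$), which is an assumption chosen for illustration and is not one of the heuristics evaluated in this work; the function names `median_heuristic_gamma` and `rbf_kernel` are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def median_heuristic_gamma(points):
    """Estimate an RBF gamma from unlabelled points.

    Illustrative only: this is the generic median heuristic,
    not the covtrace or MC heuristic from the paper.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Squared Euclidean distance for every unordered pair of points.
    sq_dists = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    ]
    med = median(sq_dists)
    if med <= 0:
        raise ValueError("median squared distance is zero; data are degenerate")
    # Kernel convention assumed here: K(x, y) = exp(-gamma * ||x - y||^2).
    return 1.0 / (2.0 * med)


def rbf_kernel(p, q, gamma):
    """Gaussian RBF kernel value for two points under the convention above."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    unlabelled = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    gamma = median_heuristic_gamma(unlabelled)
    print(gamma, rbf_kernel(unlabelled[0], unlabelled[1], gamma))
```

No class labels enter the computation, which is why estimates of this kind are much cheaper than a cross-validated grid search: they need a single pass over pairwise distances rather than repeated model fits.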
