Snacks: a fast large-scale kernel SVM solver
Sofiane Tanji, Andrea Della Vecchia, François Glineur, Silvia Villa
TL;DR
Snacks addresses the quadratic $O(n^2)$ time and memory complexity of kernel SVMs on large datasets by combining a Nyström low-rank approximation of the kernel matrix with an Accelerated Stochastic SubGradient (ASSG) method. The Nyström approach embeds the data in a space of reduced dimension $m$, with $m \ll n$, so that a linear solver can be run on the compressed representation. The paper provides convergence guarantees adapted from ASSG (e.g., $\tilde{O}\left(\frac{\log(1/\delta)}{\varepsilon}\right)$ for $L^2$-SVM and $\tilde{O}\left(\log\frac{1}{\delta \varepsilon}\right)$ for $L^1$-SVM) and shows empirically that Snacks achieves competitive training times and accuracy against state-of-the-art solvers on diverse large-scale datasets. This work offers a practical tool for large-scale kernel learning and points to future extensions, including $L^1$-SVM, multi-class handling, and sampling-based guarantees for Nyström substitutions.
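To make the kernel data embedding concrete, the following is a minimal sketch of a Nyström feature map, not the Snacks implementation. It assumes a Gaussian kernel, uniform sampling of the $m$ centers, and a small diagonal jitter for numerical stability; the names `gaussian_kernel`, `nystrom_embedding`, `sigma`, and `jitter` are illustrative and do not come from the paper.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def nystrom_embedding(X, m, sigma=1.0, seed=0, jitter=1e-8):
    """Map n points to m-dimensional features Phi with Phi @ Phi.T ~ K.

    Assumption: centers are drawn uniformly at random without replacement.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    centers = X[idx]

    # Factor the small m x m kernel matrix: K_mm + jitter * I = L @ L.T
    K_mm = gaussian_kernel(centers, centers, sigma)
    L = np.linalg.cholesky(K_mm + jitter * np.eye(m))

    # Phi = K_nm @ inv(L).T, so Phi @ Phi.T = K_nm @ inv(K_mm) @ K_nm.T
    K_nm = gaussian_kernel(X, centers, sigma)
    Phi = np.linalg.solve(L, K_nm.T).T
    return Phi, idx
```

Only the $n \times m$ matrix `K_nm` and the $m \times m$ matrix `K_mm` are ever formed, so memory grows as $O(nm)$ rather than $O(n^2)$, and any linear SVM solver can then be trained on the rows of `Phi`.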
Abstract
Kernel methods provide a powerful framework for nonparametric learning. They are based on kernel functions and allow learning in a rich functional space while applying linear statistical learning tools, such as Ridge Regression or Support Vector Machines. However, standard kernel methods suffer from quadratic time and memory complexity in the number of data points and thus have limited applications in large-scale learning. In this paper, we propose Snacks, a new large-scale solver for Kernel Support Vector Machines. Specifically, Snacks relies on a Nyström approximation of the kernel matrix and an accelerated variant of the stochastic subgradient method. We demonstrate, through a detailed empirical evaluation, that it competes with other SVM solvers on a variety of benchmark datasets.
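For readers unfamiliar with stochastic subgradient methods for SVMs, here is a minimal sketch of the plain, non-accelerated variant applied to the regularized hinge loss on a feature matrix. It is a Pegasos-style baseline with step size $1/(\lambda t)$, not the accelerated scheme used by Snacks; the function name `subgradient_svm` and the parameters `lam` and `epochs` are assumptions made for illustration.

```python
import numpy as np


def subgradient_svm(Phi, y, lam=1e-2, epochs=20, seed=0):
    """Plain stochastic subgradient descent on
    (lam / 2) * ||w||^2 + mean(max(0, 1 - y_i * <Phi_i, w>)),
    with labels y_i in {-1, +1}. Not the accelerated (ASSG) variant."""
    rng = np.random.default_rng(seed)
    n, m = Phi.shape
    w = np.zeros(m)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            # Subgradient of the regularizer plus the hinge loss at sample i
            g = lam * w
            if y[i] * (Phi[i] @ w) < 1.0:
                g = g - y[i] * Phi[i]
            w = w - eta * g
    return w


def predict(Phi, w):
    """Predicted labels in {-1, +1} for the rows of Phi."""
    return np.where(Phi @ w >= 0.0, 1.0, -1.0)
```

Each update touches a single row of the feature matrix, so the per-iteration cost is $O(m)$ when the rows are $m$-dimensional Nyström features.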
