The Optimality of Kernel Classifiers in Sobolev Space
Jianfa Lai, Zhifan Li, Dongming Huang, Qian Lin
TL;DR
This work analyzes binary classification in reproducing kernel Hilbert spaces by studying kernel classifiers obtained by gradient flow, a spectral algorithm. Under a source condition that $2\eta(x)-1$ lies in an interpolation space $[\mathcal{H}]^s$, and with kernel eigenvalue decay rate $\beta$, it proves an upper bound of $O(n^{-s\beta/(2s\beta+2)})$ on the classification excess risk and establishes a matching minimax lower bound in Sobolev spaces, which together show that the classifier is minimax optimal. The results extend to the generalization error of overparameterized neural network classifiers through the neural tangent kernel, and they are complemented by a simple method for estimating the interpolation smoothness $s$ that is applied to real datasets. The appendices give the detailed technical arguments (bounds, embeddings, and intermediate steps) that support the main theorems, and the paper discusses limitations and extensions to more complex function structures.
Abstract
Kernel methods are widely used in machine learning, especially for classification problems. However, the theoretical analysis of kernel classification is still limited. This paper investigates the statistical performance of kernel classifiers. Under mild assumptions on the conditional probability $\eta(x)=\mathbb{P}(Y=1\mid X=x)$, we derive an upper bound on the classification excess risk of a kernel classifier using recent advances in the theory of kernel regression. We also obtain a minimax lower bound for Sobolev spaces, which shows the optimality of the proposed classifier. Our theoretical results can be extended to the generalization error of overparameterized neural network classifiers. To make our theoretical results more applicable in realistic settings, we also propose a simple method to estimate the interpolation smoothness of $2\eta(x)-1$ and apply the method to real datasets.
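As an illustration of the kind of classifier discussed above, the following is a minimal sketch of a kernel gradient-flow classifier, not the authors' implementation. It assumes labels coded as $\pm 1$ (so that the regression target is $2\eta(x)-1$), a Gaussian (RBF) kernel with bandwidth parameter `gamma`, and a stopping time `t`; these choices, along with the function names `rbf_kernel` and `fit_gradient_flow_classifier`, are hypothetical and chosen only for illustration. The estimator applies the gradient-flow filter $\varphi_t(\lambda)=(1-e^{-t\lambda})/\lambda$ to the normalized kernel matrix and classifies by the sign of the fitted function.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_gradient_flow_classifier(X_train, y_train, t, gamma=1.0):
    """Return a predict function for the kernel gradient-flow classifier.

    The fitted function is f_t(x) = k(x, X) @ alpha with
    alpha = (1/n) * phi_t(K/n) @ y and phi_t(lam) = (1 - exp(-t*lam)) / lam.
    Labels y_train are assumed to take values in {-1, +1}.
    """
    n = X_train.shape[0]
    K = rbf_kernel(X_train, X_train, gamma)

    # Spectral decomposition of the normalized kernel matrix K/n.
    evals, evecs = np.linalg.eigh(K / n)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative round-off

    # Gradient-flow filter; its limit as lam -> 0 is t.
    positive = evals > 1e-12
    safe = np.where(positive, evals, 1.0)
    filt = np.where(positive, -np.expm1(-t * safe) / safe, t)

    alpha = evecs @ (filt * (evecs.T @ y_train)) / n

    def predict(X_new):
        scores = rbf_kernel(X_new, X_train, gamma) @ alpha
        return np.where(scores >= 0, 1, -1)

    return predict


if __name__ == "__main__":
    # Toy data: two well-separated Gaussian clusters (illustrative only).
    rng = np.random.default_rng(0)
    X_pos = rng.normal(2.0, 0.5, size=(20, 2))
    X_neg = rng.normal(-2.0, 0.5, size=(20, 2))
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(20), -np.ones(20)])

    predict = fit_gradient_flow_classifier(X, y, t=200.0)
    print("training accuracy:", np.mean(predict(X) == y))
```

In this sketch the stopping time `t` plays the role of the regularization parameter: small `t` keeps the fit close to zero, while large `t` approaches kernel interpolation of the labels.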
