On the Saturation Effect of Kernel Ridge Regression
Yicheng Li, Haobo Zhang, Qian Lin
TL;DR
This work proves a long-standing conjecture about the saturation effect in kernel ridge regression (KRR): when the target function lies in a highly smooth interpolation space $[\mathcal{H}]^{\alpha}$ with $\alpha\ge 2$, the generalization error of KRR cannot decay faster than $n^{-2/(2+\beta)}$, where $n$ is the sample size and $\beta$ characterizes the eigenvalue decay of the kernel. The authors establish this via a bias-variance decomposition, showing $\mathbf{Bias}^2=\Omega(\lambda^2)$ and $\mathbf{Var}=\Omega(\lambda^{-\beta}/n)$ for regularization parameter $\lambda$, and carefully relating the empirical and population operators to derive these lower bounds. The main result shows that, for sufficiently smooth targets, the rate attainable by KRR falls short of the information-theoretic lower bound, confirming the saturation phenomenon; it is supported by numerical experiments contrasting KRR with gradient flow and other spectral methods. The findings expose an intrinsic limit of KRR and motivate non-saturating spectral regularization techniques when the underlying function is very smooth.
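To see how the two lower bounds combine into the stated rate, here is a rough sketch of the balancing step. It assumes the error is bounded below by the sum of the two terms with hypothetical constants $c_1, c_2 > 0$ (these constants are not from the paper), and it ignores the probabilistic qualifiers and any restrictions on $\lambda$ that the actual proof requires:

$$
\begin{aligned}
\text{Error}(\lambda) &\;\ge\; c_1\lambda^{2} + c_2\,\frac{\lambda^{-\beta}}{n}, \\
\frac{d}{d\lambda}\Big(c_1\lambda^{2} + c_2\,\frac{\lambda^{-\beta}}{n}\Big) = 0
&\;\Longrightarrow\; \lambda_*^{\,2+\beta} = \frac{\beta c_2}{2 c_1 n}
\;\Longrightarrow\; \lambda_* \asymp n^{-1/(2+\beta)}, \\
\lambda_*^{2} \asymp n^{-2/(2+\beta)}, \qquad
\frac{\lambda_*^{-\beta}}{n} &\asymp n^{\beta/(2+\beta)-1} = n^{-2/(2+\beta)}.
\end{aligned}
$$

Under these assumptions, no choice of $\lambda$ makes the right-hand side smaller than order $n^{-2/(2+\beta)}$: a smaller $\lambda$ inflates the variance term, and a larger $\lambda$ inflates the bias term.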
Abstract
The saturation effect refers to the phenomenon that kernel ridge regression (KRR) fails to achieve the information-theoretic lower bound when the smoothness of the underlying truth function exceeds a certain level. The saturation effect has been widely observed in practice, and a saturation lower bound for KRR has been conjectured for decades. In this paper, we provide a proof of this long-standing conjecture.
