Estimating the Spectral Moments of the Kernel Integral Operator from Finite Sample Matrices
Chanwoo Chun, SueYeon Chung, Daniel D. Lee
TL;DR
The paper addresses how to infer the spectrum of the kernel integral operator $T_k$ from a finitely sampled measurement matrix in which both the number of inputs and the number of features are limited. It introduces an unbiased, dynamic-programming-based estimator of the spectral moments $m(n)=\mathrm{tr}\,T_k^n$. The authors show how to construct unbiased moment estimates from cyclic products of matrix entries, derive a scalable DP algorithm with a provable variance bound, and demonstrate robustness to noise. They validate the method on RBF kernels, derive analytic spectra, and reconstruct eigenvalues from the estimated moments; they also illustrate the approach by analyzing neural feature representations during training. Because the moment estimates are unbiased with respect to the size of the measurement matrix, the framework supports more accurate analysis of representation geometry and learning dynamics in kernel methods and wide neural networks, with practical implications for kernel approximation and spectral analysis in large-scale models.
Abstract
Analyzing the structure of sampled features from an input data distribution is challenging when constrained by limited measurements in both the number of inputs and features. Traditional approaches often rely on the eigenvalue spectrum of the sample covariance matrix derived from finite measurement matrices; however, these spectra are sensitive to the size of the measurement matrix, leading to biased insights. In this paper, we introduce a novel algorithm that provides unbiased estimates of the spectral moments of the kernel integral operator in the limit of infinite inputs and features from finitely sampled measurement matrices. Our method, based on dynamic programming, is efficient and capable of estimating the moments of the operator spectrum. We demonstrate the accuracy of our estimator on radial basis function (RBF) kernels, highlighting its consistency with the theoretical spectra. Furthermore, we showcase the practical utility and robustness of our method in understanding the geometry of learned representations in neural networks.
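To make the idea of an unbiased moment estimate from cyclic products concrete, the following is a minimal sketch for the second moment only. It is not the paper's dynamic-programming algorithm. It assumes (these conventions are not stated in the text above) a $P \times Q$ measurement matrix $\Phi$ with entries $\Phi_{i\alpha}=\phi_\alpha(x_i)$, a kernel $k(x,y)=\mathbb{E}_\alpha[\phi_\alpha(x)\phi_\alpha(y)]$, and $m(2)=\mathbb{E}_{x,y}[k(x,y)^2]$. Under those assumptions, averaging the cyclic product $\Phi_{i\alpha}\Phi_{j\alpha}\Phi_{j\beta}\Phi_{i\beta}$ over distinct inputs $i\neq j$ and distinct features $\alpha\neq\beta$ gives an unbiased estimate, whereas the naive trace of the squared sample kernel also includes the repeated-index terms. The function names are hypothetical.

```python
import numpy as np


def naive_second_moment(phi):
    """Plug-in estimate: trace of the squared, normalized sample kernel.

    Includes terms with repeated input or feature indices, which is the
    source of the size-dependent bias (assumed normalization: K / P).
    """
    P, Q = phi.shape
    K = phi @ phi.T / Q          # sample kernel, averaged over features
    return np.trace(K @ K) / P**2


def unbiased_second_moment(phi):
    """Average of cyclic products over i != j and alpha != beta."""
    P, Q = phi.shape
    if P < 2 or Q < 2:
        raise ValueError("need at least 2 inputs and 2 features")
    G = phi @ phi.T              # G[i, j] = sum_a phi[i, a] * phi[j, a]
    S = (phi**2) @ (phi**2).T    # the alpha == beta terms of G[i, j]**2
    M = G**2 - S                 # sum over alpha != beta for each (i, j)
    off_diag = M.sum() - np.trace(M)   # keep only i != j
    return off_diag / (P * (P - 1) * Q * (Q - 1))


def brute_force_second_moment(phi):
    """Explicit four-index loop, used only to check the vectorized form."""
    P, Q = phi.shape
    total = 0.0
    for i in range(P):
        for j in range(P):
            if i == j:
                continue
            for a in range(Q):
                for b in range(Q):
                    if a == b:
                        continue
                    total += phi[i, a] * phi[j, a] * phi[j, b] * phi[i, b]
    return total / (P * (P - 1) * Q * (Q - 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = rng.standard_normal((5, 4))
    print("naive   :", naive_second_moment(phi))
    print("unbiased:", unbiased_second_moment(phi))
    print("brute   :", brute_force_second_moment(phi))
```

The vectorized form costs two matrix products instead of a four-index loop. Extending the same distinct-index averaging to higher moments $m(n)$ by direct enumeration grows quickly in cost, which is the setting the paper's dynamic-programming approach is described as addressing.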
