Extreme value theory for singular subspace estimation in the matrix denoising model
Junhyung Chang, Joshua Cape
TL;DR
This work develops an extreme-value inference framework for singular subspace estimation in the Gaussian matrix denoising model. The central quantity is the maximum row-wise error $\|\widehat{\mathbf{U}}\mathbf{R}_{\mathbf{U}} - \mathbf{U}\|_{2,\infty}$, whose distribution converges to a Gumbel limit after appropriate centering and scaling. By combining row-wise perturbation analysis, random matrix theory, and saddle-point approximations, the authors show tail equivalence to a generalized gamma distribution and derive explicit normalizing constants, which yields a data-driven plug-in test based on de-biased singular values. They establish asymptotic Type I error control and power results for hypothesis tests on singular subspaces, and simulations show higher power against row-localized alternatives than Frobenius-norm-based tests. Simulations also indicate robustness to some non-Gaussian noise distributions. The approach offers a fine-grained test of low-rank subspace structure in high-dimensional denoising, with potential extensions to heteroskedastic and non-Gaussian settings.
Abstract
This paper studies fine-grained singular subspace estimation in the matrix denoising model where a deterministic low-rank signal matrix is additively perturbed by a stochastic matrix of Gaussian noise. We establish that the maximum Euclidean row norm (i.e., the two-to-infinity norm) of the aligned difference between the leading sample and population singular vectors approaches the Gumbel distribution in the large-matrix limit, under suitable signal-to-noise conditions and after appropriate centering and scaling. We apply our novel asymptotic distributional theory to test hypotheses of low-rank signal structure encoded in the leading singular vectors and their corresponding principal subspace. We provide de-biased estimators for the corresponding nuisance signal singular values and show that our proposed plug-in test statistic has desirable properties. Notably, compared to using the Frobenius norm subspace distance, our test statistic based on the two-to-infinity norm empirically has higher power to detect structured alternatives that differ from the null in only a few matrix entries or rows. Our main results are obtained through a novel synthesis of row-wise matrix perturbation analysis, extreme value theory, saddle point approximation methods, and random matrix theory, together with the accompanying technical analysis. Our contributions complement the existing literature for matrix denoising focused on minimaxity, mean squared error analysis, unitarily invariant distances between subspaces, component-wise asymptotic distributional theory, and row-wise uniform error bounds. Numerical simulations illustrate our main results and demonstrate the robustness properties of our testing procedure to non-Gaussian noise distributions.
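As a rough illustration of the quantity studied above, the following Python sketch simulates one draw from a matrix denoising model and computes the two-to-infinity norm of the aligned singular vector difference, alongside the Frobenius norm for comparison. Several choices here are assumptions made for illustration and are not taken from the paper: the alignment matrix is taken to be the orthogonal Procrustes solution, the signal is a randomly generated rank-$r$ matrix, and the function names and default parameters are hypothetical. The centering and scaling constants needed for the Gumbel limit, and the de-biased singular value estimators, are not reproduced.

```python
import numpy as np


def two_to_infinity_norm(M):
    """Maximum Euclidean row norm of the matrix M."""
    return float(np.max(np.linalg.norm(M, axis=1)))


def procrustes_align(U_hat, U):
    """Orthogonal R minimizing ||U_hat @ R - U||_F.

    Assumption: the paper's alignment matrix may be defined differently;
    the Procrustes solution is used here only as a common choice.
    """
    W1, _, W2t = np.linalg.svd(U_hat.T @ U)
    return W1 @ W2t


def simulate_statistic(n=300, m=300, r=2, signal=200.0, sigma=1.0, seed=0):
    """Draw Y = M + noise and return (two-to-infinity, Frobenius) errors.

    All parameter names and defaults are hypothetical illustration values.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical rank-r signal with orthonormal singular vectors.
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))
    V, _ = np.linalg.qr(rng.standard_normal((m, r)))
    s = signal * np.arange(r, 0, -1)  # distinct, decreasing singular values
    M = (U * s) @ V.T

    # Additive i.i.d. Gaussian noise with standard deviation sigma.
    Y = M + sigma * rng.standard_normal((n, m))

    # Leading r left singular vectors of the observed matrix.
    U_hat = np.linalg.svd(Y, full_matrices=False)[0][:, :r]

    # Aligned difference between sample and population singular vectors.
    D = U_hat @ procrustes_align(U_hat, U) - U
    return two_to_infinity_norm(D), float(np.linalg.norm(D, "fro"))


if __name__ == "__main__":
    t_2inf, t_fro = simulate_statistic()
    print("two-to-infinity error:", t_2inf)
    print("Frobenius error:", t_fro)
```

The two-to-infinity error is driven by the single worst row, whereas the Frobenius error aggregates over all rows. This is the intuition behind the reported power advantage against alternatives that differ from the null in only a few rows.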
