Leave-one-out Singular Subspace Perturbation Analysis for Spectral Clustering

Anderson Y. Zhang, Harrison H. Zhou

TL;DR

This work develops a leave-one-out perturbation framework for singular subspaces to obtain sharper, entrywise guarantees for spectral clustering under mixture models. By exploiting the independence between leave-one-out estimators and a given observation, the authors derive exponential misclustering bounds for sub-Gaussian mixtures and, in the isotropic Gaussian case, minimax-optimal rates with exact recovery thresholds. The approach improves upon classical perturbation results such as Wedin's theorem by accounting for the interaction between projected signal and residual, and it supports adaptive dimension reduction and broader extensions to eigenspaces and high-dimensional regimes. Collectively, the results advance theoretical understanding of spectral clustering under realistic noise models and provide practically relevant, provable guarantees with explicit constants.

Abstract

The singular subspaces perturbation theory is of fundamental importance in probability and statistics. It has various applications across different fields. We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other one and establish a novel perturbation upper bound for the distance between the two corresponding singular subspaces. It is well-suited for mixture models and results in a sharper and finer statistical analysis than classical perturbation bounds such as Wedin's Theorem. Empowered by this leave-one-out perturbation theory, we provide a deterministic entrywise analysis for the performance of spectral clustering under mixture models. Our analysis leads to an explicit exponential error rate for spectral clustering of sub-Gaussian mixture models. For the mixture of isotropic Gaussians, the rate is optimal under a weaker signal-to-noise condition than that of Löffler et al. (2021).
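To make the central object concrete, the following is a minimal numerical sketch of the quantity the abstract describes: the distance between the top-k left singular subspace of a data matrix and that of the same matrix with one column removed. It is not the paper's bound or code. The function names, the projection-matrix Frobenius distance, and the toy two-component Gaussian mixture (dimensions, centers, seed) are all assumptions chosen for illustration.

```python
import numpy as np


def top_left_singular_vectors(X, k):
    """Return the k leading left singular vectors of X as a (p, k) matrix."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]


def leave_one_out_subspace_distance(X, i, k):
    """Frobenius distance between the projections onto the top-k left
    singular subspaces of X and of X with column i removed.

    This distance is one common choice (an assumption here); the paper
    may measure subspace distance differently.
    """
    U = top_left_singular_vectors(X, k)
    X_loo = np.delete(X, i, axis=1)  # leave-one-column-out submatrix
    U_loo = top_left_singular_vectors(X_loo, k)
    return np.linalg.norm(U @ U.T - U_loo @ U_loo.T, ord="fro")


# Toy data (hypothetical): two isotropic Gaussian clusters in R^p,
# stored as columns of a p x n matrix.
rng = np.random.default_rng(0)
p, n, k = 20, 200, 2
centers = np.zeros((p, k))
centers[0, 0] = 10.0  # first center along coordinate 0
centers[1, 1] = 10.0  # second center along coordinate 1
labels = rng.integers(0, k, size=n)
X = centers[:, labels] + rng.standard_normal((p, n))

# Removing a single column should move the subspace only slightly
# when the cluster centers are well separated relative to the noise.
d = leave_one_out_subspace_distance(X, i=0, k=k)
print(f"leave-one-out subspace distance: {d:.4f}")
```

The leave-one-out subspace computed from the remaining columns does not depend on the removed observation, which is the independence that the analysis described in the abstract exploits.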

Paper Structure (31 sections, 20 theorems, 237 equations, 2 algorithms)