Finite-sample confidence regions for spectral clustering and graph centrality

Chandrasekhar Gokavarapu, Sekhar Babu Gosala, Vamis Pasalapudi, Tarakarama Kapakayala

TL;DR

This paper develops finite-sample inference for spectral graph procedures and isolates a failure of common practice: asymptotic perturbation arguments are often invoked without a finite-sample spectral gap, which leads to invalid uncertainty claims.

Abstract

Let a graph be observed through a finite random sampling mechanism. Spectral methods are routinely applied to such graphs, yet their outputs are treated as deterministic objects. This paper develops finite-sample inference for spectral graph procedures. The primary result constructs explicit confidence regions for latent eigenspaces of graph operators under an explicit sampling model. These regions propagate to confidence regions for spectral clustering assignments and for smooth graph centrality functionals. All bounds are nonasymptotic and depend explicitly on the sample size, noise level, and spectral gap. The analysis isolates a failure of common practice: asymptotic perturbation arguments are often invoked without a finite-sample spectral gap, leading to invalid uncertainty claims. Under verifiable gap and concentration conditions, the present framework yields coverage guarantees and certified stability regions. Several corollaries address fairness-constrained post-processing and topological summaries derived from spectral embeddings.

Paper Structure

This paper contains 90 sections, 22 theorems, and 141 equations.

Key Result

Proposition 1.1

Assume the sampling model in §subsec:model and the spectral gap condition in §subsec:main-results. There exists an explicit, data-dependent radius $\widehat{r}_{n,\alpha}$ and an explicit orthogonal-invariant distance $d_{\mathrm{Gr}}(\cdot,\cdot)$ on the Grassmannian such that the random set satisfies the finite-sample coverage property with $\widehat{r}_{n,\alpha}$ given in closed form in term
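
To make the objects in Proposition 1.1 concrete, the following is a minimal sketch of an orthogonal-invariant subspace distance and a membership check for a ball-shaped region around an estimated eigenspace. The excerpt above does not give the paper's actual $d_{\mathrm{Gr}}$ or $\widehat{r}_{n,\alpha}$, so everything here is an assumption: the distance is the common projection (sin-theta) distance, the toy matrices `M` and `M_hat` stand in for a population and an observed operator, and `oracle_bound` is a textbook Davis-Kahan-type bound (in the Yu, Wang and Samworth form) that uses the unobserved perturbation, so it is not a data-dependent radius.

```python
import numpy as np

def top_eigenspace(M, k):
    """Orthonormal basis of the span of the k leading eigenvectors of symmetric M."""
    _, vecs = np.linalg.eigh(M)      # eigenvalues returned in ascending order
    return vecs[:, -k:]

def projection_distance(U, V):
    """Spectral norm of U U^T - V V^T; unchanged if U or V is rotated by an orthogonal Q."""
    return float(np.linalg.norm(U @ U.T - V @ V.T, ord=2))

def in_region(U_candidate, U_hat, radius):
    """Membership in the ball {U : d(U, U_hat) <= radius} (hypothetical region shape)."""
    return projection_distance(U_candidate, U_hat) <= radius

rng = np.random.default_rng(0)
n, k = 30, 2

# Toy rank-k "population" operator and a symmetric noisy observation of it.
X = rng.standard_normal((n, k))
M = X @ X.T
E = 0.05 * rng.standard_normal((n, n))
E = (E + E.T) / 2
M_hat = M + E

U_star = top_eigenspace(M, k)
U_hat = top_eigenspace(M_hat, k)

# Orthogonal invariance: rotating the basis does not move the subspace.
Q, _ = np.linalg.qr(rng.standard_normal((k, k)))
d_same = projection_distance(U_hat, U_hat @ Q)

# Distance between estimated and population eigenspaces.
d_est = projection_distance(U_hat, U_star)

# Oracle comparison only: 2 * sqrt(k) * ||E||_op / gap, with the population gap
# lambda_k(M) - lambda_{k+1}(M). It needs E and M, which are unknown in practice.
eigvals = np.linalg.eigvalsh(M)      # ascending
gap = eigvals[-k] - eigvals[-k - 1]
oracle_bound = 2 * np.sqrt(k) * np.linalg.norm(E, ord=2) / gap

print(d_same, d_est, oracle_bound)
```

The sketch shows why a gap condition matters: `oracle_bound` scales inversely with `gap`, so without a lower bound on the gap no finite radius can be certified.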

Theorems & Definitions (65)

  • Proposition 1.1: Primary technical target
  • Theorem 1.2: Confidence region for the latent subspace
  • Theorem 1.3: Confidence region for clustering and centrality
  • Corollary 1.4: Finite-sample coverage under explicit rates
  • Corollary 1.5: Stability region for a downstream statistic
  • Definition 2.1: Conditionally independent edges
  • Definition 2.2: SBM/DCSBM specialization
  • Definition 2.3: RDPG/GRDPG specialization
  • Definition 2.4: Variance proxy for adjacency
  • Lemma 2.5: Matrix concentration template
  • ...and 55 more
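
As an illustration of the kind of sampling model named in Definitions 2.1 and 2.2, the sketch below draws an undirected graph whose edges are independent Bernoulli variables given block labels. The definitions themselves are not reproduced in this excerpt, so the function name `sample_sbm`, the block matrix `B`, and the label vector are hypothetical choices for a standard two-block stochastic block model, not the paper's exact specification.

```python
import numpy as np

def sample_sbm(labels, B, rng):
    """Draw a symmetric 0/1 adjacency matrix with no self-loops.

    Edge {i, j} is present with probability B[labels[i], labels[j]],
    independently across pairs i < j given the labels.
    """
    P = B[labels][:, labels]             # n x n matrix of edge probabilities
    n = len(labels)
    draws = rng.random((n, n)) < P       # independent Bernoulli draws
    upper = np.triu(draws, k=1)          # keep pairs i < j only
    A = (upper | upper.T).astype(int)    # symmetrize
    return A, P

rng = np.random.default_rng(1)
labels = np.array([0] * 30 + [1] * 30)   # two blocks of 30 nodes (assumed sizes)
B = np.array([[0.50, 0.05],
              [0.05, 0.40]])             # assumed within/between block probabilities
A, P = sample_sbm(labels, B, rng)

print(A.shape, A.sum() // 2)             # matrix shape and number of edges
```

Only the upper triangle is sampled and then mirrored, so each unordered pair is drawn exactly once and the adjacency matrix is symmetric by construction.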