Finite-sample confidence regions for spectral clustering and graph centrality

Chandrasekhar Gokavarapu; Sekhar Babu Gosala; Vamis Pasalapudi; Tarakarama Kapakayala

TL;DR

This paper develops finite-sample inference for spectral graph procedures and isolates a failure of common practice: asymptotic perturbation arguments are often invoked without a finite-sample spectral gap, which leads to invalid uncertainty claims.

Abstract

Let a graph be observed through a finite random sampling mechanism. Spectral methods are routinely applied to such graphs, yet their outputs are treated as deterministic objects. This paper develops finite-sample inference for spectral graph procedures. The primary result constructs explicit confidence regions for latent eigenspaces of graph operators under an explicit sampling model. These regions propagate to confidence regions for spectral clustering assignments and for smooth graph centrality functionals. All bounds are nonasymptotic and depend explicitly on the sample size, noise level, and spectral gap. The analysis isolates a failure of common practice: asymptotic perturbation arguments are often invoked without a finite-sample spectral gap, leading to invalid uncertainty claims. Under verifiable gap and concentration conditions, the present framework yields coverage guarantees and certified stability regions. Several corollaries address fairness-constrained post-processing and topological summaries derived from spectral embeddings.

Paper Structure (90 sections, 22 theorems, 141 equations)

Proposition and scope
Primary proposition (spine)
Model class and observables
A concrete failure mode in earlier logic
Main theorems and corollaries
Secondary corollaries (fairness, topology)
Organization
Preliminaries
Notation
Sampling model
Concentration tools
Perturbation tools
Identifiability and spectral gap
What is fixed and what is inferred
Technical Section I: subspace inference
...and 75 more sections

Key Result

Proposition 1.1

Assume the sampling model in §subsec:model and the spectral gap condition in §subsec:main-results. There exists an explicit, data-dependent radius $\widehat{r}_{n,\alpha}$ and an explicit orthogonal-invariant distance $d_{\mathrm{Gr}}(\cdot,\cdot)$ on the Grassmannian such that the random set satisfies the finite-sample coverage property with $\widehat{r}_{n,\alpha}$ given in closed form in term
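To make the notion of an orthogonal-invariant distance on the Grassmannian concrete, here is a minimal sketch. It assumes the projection (sin-theta) Frobenius distance as one common choice; the excerpt above does not specify which $d_{\mathrm{Gr}}$ the paper uses, and it does not give the closed form of $\widehat{r}_{n,\alpha}$, so the radius is left as a plain input. The function names `projection_distance` and `in_region` are hypothetical and chosen for illustration only.

```python
import numpy as np


def orthonormal_basis(A):
    """Return an orthonormal basis for the column space of A (via QR)."""
    Q, _ = np.linalg.qr(A)
    return Q


def projection_distance(U, V):
    """Frobenius norm of sin(Theta) between span(U) and span(V).

    ASSUMPTION: this is one standard orthogonal-invariant distance on the
    Grassmannian, not necessarily the d_Gr used in the paper.
    U and V must have orthonormal columns and the same shape.
    """
    # Singular values of U^T V are the cosines of the principal angles.
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding above 1
    return float(np.sqrt(np.sum(1.0 - cosines**2)))


def in_region(U_hat, U, radius):
    """Check whether span(U) lies within `radius` of span(U_hat).

    ASSUMPTION: `radius` stands in for the data-dependent radius; its
    closed form is not available in this excerpt.
    """
    return projection_distance(U_hat, U) <= radius


rng = np.random.default_rng(0)
n, k = 50, 3

U = orthonormal_basis(rng.standard_normal((n, k)))
R = orthonormal_basis(rng.standard_normal((k, k)))  # k x k orthogonal matrix

# Same subspace expressed in a rotated basis: distance is numerically zero.
d_same = projection_distance(U, U @ R)

# An unrelated random subspace: distance is bounded above by sqrt(k).
V = orthonormal_basis(rng.standard_normal((n, k)))
d_diff = projection_distance(U, V)
```

Because the distance depends only on the principal angles, it is unchanged when either basis is multiplied on the right by an orthogonal matrix. That invariance is what lets a region of this kind be stated for the subspace itself rather than for a particular choice of eigenvector basis.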

Theorems & Definitions (65)

Proposition 1.1: Primary technical target
Theorem 1.2: Confidence region for the latent subspace
Theorem 1.3: Confidence region for clustering and centrality
Corollary 1.4: Finite-sample coverage under explicit rates
Corollary 1.5: Stability region for a downstream statistic
Definition 2.1: Conditionally independent edges
Definition 2.2: SBM/DCSBM specialization
Definition 2.3: RDPG/GRDPG specialization
Definition 2.4: Variance proxy for adjacency
Lemma 2.5: Matrix concentration template
...and 55 more
