Hypothesis testing on invariant subspaces of non-diagonalizable matrices with applications to network statistics

Jérôme R. Simons

TL;DR

This paper extends inference for eigenvectors to invariant and singular subspaces of non-symmetric, potentially non-diagonalizable matrices, enabling principled hypothesis testing in directed networks. It develops Wald tests for invariant and singular subspaces and practical $t$-tests for individual coefficients, underpinned by higher-order perturbation theory and a smooth resolvent-based map that yields arbitrary-order expansions. The framework applies to network statistics that depend on eigenvectors of estimated adjacency matrices, providing convergence rates and standard errors for centralities such as eigenvector, PageRank, Katz, and diffusion centralities, and demonstrating how uncertainty in links can alter node rankings. Through simulations and empirical applications (trade networks, digraphs), the work shows reliable finite-sample performance and highlights when uncertainty changes empirical conclusions, offering a robust toolkit for uncertainty quantification in spectral network analysis.

Abstract

We generalise the inference procedure of Tyler (1981) for eigenvectors of symmetrizable matrices to invariant and singular subspaces of non-diagonalizable matrices. Wald tests for invariant vectors and $t$-tests for their individual coefficients perform well in simulations, even though the matrix is not symmetric. These results make it possible to perform inference on network statistics that depend on eigenvectors of non-symmetric adjacency matrices, as they arise in empirical applications with directed networks. Further, we find that statisticians only need control over the first-order Davis-Kahan bound to control convergence rates of invariant subspace estimators to higher orders. For general invariant subspaces, the minimal eigenvalue separation dominates the first-order bound, potentially slowing convergence rates considerably. In an example, we find that accounting for uncertainty in network estimates changes empirical conclusions about the ranking of nodes' popularity.
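The last claim of the abstract, that uncertainty in network estimates can change node rankings, can be illustrated with a small Monte Carlo sketch. This is not the paper's Wald or $t$-test procedure. It assumes a hypothetical 4-node weighted directed network (`EXAMPLE_ADJ`), multiplicative log-normal noise on each link weight, and eigenvector centrality defined from incoming links and computed by power iteration. All function names are illustrative.

```python
import math
import random

# Hypothetical weighted directed network: EXAMPLE_ADJ[i][j] is the weight of link i -> j.
# All off-diagonal weights are positive, so power iteration converges (assumption).
EXAMPLE_ADJ = [
    [0.0, 2.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 1.0],
    [1.1, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 0.0],
]

def eigenvector_centrality(adj, iters=1000, tol=1e-12):
    """Dominant left eigenvector of adj via power iteration, normalised to sum to 1."""
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iters):
        # A node's score is the weighted sum of the scores of nodes linking to it.
        new = [sum(adj[i][j] * x[i] for i in range(n)) for j in range(n)]
        total = sum(new)
        new = [v / total for v in new]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x

def ranking(scores):
    """Node indices ordered from most to least central."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

def perturb(adj, sigma, rng):
    """Multiply every link weight by independent log-normal noise (assumed noise model)."""
    return [[w * math.exp(sigma * rng.gauss(0.0, 1.0)) for w in row] for row in adj]

def rank_instability(adj, sigma=0.2, reps=500, seed=0):
    """Return the noise-free ranking and the share of noisy draws whose ranking differs."""
    rng = random.Random(seed)
    base = ranking(eigenvector_centrality(adj))
    changed = sum(
        ranking(eigenvector_centrality(perturb(adj, sigma, rng))) != base
        for _ in range(reps)
    )
    return base, changed / reps

if __name__ == "__main__":
    base, share = rank_instability(EXAMPLE_ADJ)
    print("noise-free ranking:", base)
    print("share of draws with a different ranking:", share)
```

A large share of changed rankings indicates that point estimates of centrality alone are not enough to order the nodes, which is the situation where formal standard errors and tests become relevant.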

Paper Structure

This paper contains 34 sections, 30 theorems, 143 equations, 5 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Suppose the general assumption (label assu:general-ass in the source) holds. Then $\hat{W}_{n}\left(\upsilon_{\perp}\right)\rightsquigarrow\chi_{qm}^{2}$.
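As a reading aid, the limit in Theorem 1 translates into the usual asymptotic decision rule. The sketch below assumes that the convergence is stated under the null hypothesis $H_{0}$. The symbols $H_{0}$, $\alpha$ and the quantile notation $\chi^{2}_{qm,\,1-\alpha}$ are introduced here for illustration and do not appear in this excerpt.

```latex
\text{Reject } H_{0} \text{ at level } \alpha
\quad\Longleftrightarrow\quad
\hat{W}_{n}\left(\upsilon_{\perp}\right) > \chi^{2}_{qm,\,1-\alpha},
\qquad
\lim_{n\to\infty} P_{H_{0}}\!\left(\hat{W}_{n}\left(\upsilon_{\perp}\right) > \chi^{2}_{qm,\,1-\alpha}\right) = \alpha .
```

Here $\chi^{2}_{qm,\,1-\alpha}$ denotes the $(1-\alpha)$ quantile of a chi-squared distribution with $qm$ degrees of freedom.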

Figures (5)

  • Figure 6.1: Confidence intervals ($95\%$) for eigenvector centralities of (a) trade network (one-sided) and (b) input-output network (two-sided).
  • Figure 6.2: Estimated networks from (a) trade data and (b) input-output of sectors of the US economy. Arrow thickness indicates trade volume while node size indicates estimated centrality score.
  • Figure 6.3: Example graph measured with noise and quality of the asymptotic approximation for inference on eigenvector centralities. $1000$ MC repetitions were used for a sample size of $500$. Q-Q plots are theoretical ($y$) vs. empirical ($x$).
  • Figure 6.4: Quality of asymptotic approximation. Left column: Wald test, right column: $t$-test. $5000$ MC repetitions were used for a sample size of $100$. Q-Q plots are theoretical ($y$) vs. empirical ($x$).
  • Figure 6.5: Quality of asymptotic approximation for Wald test-based inference on singular vectors. $2000$ MC repetitions were used for a sample size of $500$. Q-Q plots are theoretical ($y$) vs. empirical ($x$).

Theorems & Definitions (60)

  • Example 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Corollary 1: Folded normal distribution
  • Theorem 4: Distribution of basis vectors
  • Lemma 1
  • Theorem 5
  • Lemma 2: Jacobians
  • Theorem 6: Higher-order Davis-Kahan bounds
  • ...and 50 more