Universal Rank Inference via Residual Subsampling with Application to Large Networks

Xiao Han, Qing Yang, Yingying Fan

Abstract

Determining the precise rank is an important problem in many large-scale applications with matrix data that exploit low-rank plus noise models. In this paper, we suggest a universal approach to rank inference via residual subsampling (RIRS) for testing and estimating rank in a wide family of models, which includes many widely used network models, such as the degree-corrected mixed membership model, as special cases. Our procedure constructs a test statistic by subsampling entries of the residual matrix after extracting the spiked components. The test statistic converges in distribution to the standard normal under the null hypothesis, and diverges to infinity with asymptotic probability one under the alternative hypothesis. The effectiveness of the RIRS procedure is justified theoretically, utilizing the asymptotic expansions of eigenvectors and eigenvalues for large random matrices recently developed in [11] and [12]. The advantages of the newly suggested procedure are demonstrated through several simulation and real-data examples.
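
The abstract describes the procedure only at a high level, so the following is a minimal sketch of the idea rather than the authors' exact statistic. It assumes a symmetric data matrix, removes the leading `K0` eigen-components, keeps each upper-triangular residual entry independently with probability `1/m`, and standardizes the subsampled sum by the root sum of squared residuals. The function name, the parameter `m`, the normalization, and the toy two-block Gaussian data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def residual_subsample_statistic(X, K0, m, rng):
    """Illustrative (hypothetical) residual-subsampling statistic for H0: rank = K0.

    X   : symmetric (n, n) data matrix, assumed low-rank mean plus noise
    K0  : hypothesized rank
    m   : subsampling parameter; each entry is kept with probability 1/m
    rng : numpy.random.Generator used for the subsampling
    """
    n = X.shape[0]
    # Extract the K0 spiked components (largest eigenvalues in magnitude).
    vals, vecs = np.linalg.eigh(X)
    top = np.argsort(np.abs(vals))[::-1][:K0]
    low_rank = (vecs[:, top] * vals[top]) @ vecs[:, top].T
    W = X - low_rank  # residual matrix

    # Subsample residual entries above the diagonal (no self-loops).
    iu = np.triu_indices(n, k=1)
    w = W[iu]
    keep = rng.random(w.size) < 1.0 / m

    # Standardize: if the residuals behave like independent mean-zero noise,
    # Var(sum of kept w) is roughly sum(w^2) / m.
    return np.sqrt(m) * np.sum(w[keep]) / np.sqrt(np.sum(w ** 2))


# Toy data (an assumption for illustration): a rank-2 two-block mean matrix
# plus symmetric Gaussian noise with zero diagonal.
rng = np.random.default_rng(0)
n = 400
Z = np.zeros((n, 2))
Z[:200, 0] = 1.0
Z[200:, 1] = 1.0
B = np.array([[2.0, 0.5], [0.5, 0.5]])
H = Z @ B @ Z.T
G = rng.normal(scale=0.3, size=(n, n))
E = np.triu(G, 1)
E = E + E.T
X = H + E

t_null = residual_subsample_statistic(X, K0=2, m=4, rng=rng)  # correct rank
t_alt = residual_subsample_statistic(X, K0=1, m=4, rng=rng)   # rank too small
print("statistic at K0=2:", t_null)
print("statistic at K0=1:", t_alt)
```

In this toy setting the statistic should stay moderate when `K0` equals the true rank and become large when `K0` is too small, which mirrors the null and alternative behaviour stated in the abstract. The subsampling step matters because the full sum of residual entries can be nearly degenerate after the leading eigen-components are removed.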

Paper Structure

This paper contains 22 sections, 8 theorems, 88 equations, 2 figures, 6 tables.

Key Result

Theorem 3.1

Assume Conditions cond1-cond6 hold. Under the null hypothesis in eq:hypothesis, we have

Figures (2)

  • Figure 1: Histograms and estimated densities (red curves) of the RIRS test statistic when $K=2$ and $\rho=0.7$. Left: $T_n$ without self-loops; Right: $\widetilde{T}_n$ with self-loops.
  • Figure 2: DCMM. Histograms and estimated densities (red curves) of the RIRS test statistic when $K=3$ and $n=1500$. Left: $T_n$ without self-loops; Right: $\widetilde{T}_n$ with self-loops.

Theorems & Definitions (13)

  • Theorem 3.1
  • Theorem 3.2
  • Corollary 1
  • Example 1
  • Theorem 3.3
  • Theorem 3.4
  • Corollary 2
  • Remark 1
  • Remark 2
  • Theorem 3.5
  • ...and 3 more