Goodness-of-fit test for multi-layer stochastic block models

Huan Qing

TL;DR

This paper develops a novel goodness-of-fit test for the popular multi-layer stochastic block model, based on a normalized aggregation of layer-wise adjacency matrices, and establishes the asymptotic normality of the test statistic using recent advances in random matrix theory.

Abstract

Community detection in multi-layer networks is a fundamental task in complex network analysis across areas such as the social, biological, and computer sciences. However, most existing algorithms assume that the number of communities is known in advance, which is usually impractical for real-world multi-layer networks. To address this limitation, we develop a novel goodness-of-fit test for the popular multi-layer stochastic block model based on a normalized aggregation of layer-wise adjacency matrices. Under the null hypothesis that a candidate community count is correct, we establish the asymptotic normality of the test statistic using recent advances in random matrix theory; conversely, we prove its divergence when the model is underfitted. These dual theoretical foundations enable two computationally efficient sequential testing algorithms that consistently determine the number of communities without prior knowledge. Numerical experiments on simulated and real-world multi-layer networks demonstrate the accuracy and efficiency of our approaches in estimating the number of communities.
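The sequential idea described in the abstract can be illustrated with a minimal sketch: try candidate community counts $K_0 = 1, 2, \ldots$ in order and stop at the first one the test does not reject. This is not the paper's algorithm. The function names, the `k_max` cap, and the fixed two-sided normal threshold are assumptions made for illustration, and `toy_statistic` is a hypothetical placeholder rather than the test statistic built from the normalized aggregated adjacency matrices.

```python
from statistics import NormalDist
from typing import Callable, Optional


def estimate_num_communities(
    statistic: Callable[[int], float],
    k_max: int,
    alpha: float = 0.05,
) -> Optional[int]:
    """Return the smallest candidate K0 in 1..k_max that is not rejected.

    `statistic(k0)` is assumed to return a test statistic that is roughly
    N(0, 1) when k0 is correct and large in magnitude when k0 is too small.
    The fixed normal-quantile threshold is an assumption of this sketch.
    """
    # Two-sided critical value of the standard normal distribution.
    threshold = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    for k0 in range(1, k_max + 1):
        if abs(statistic(k0)) < threshold:
            return k0  # first candidate that passes the goodness-of-fit test
    return None  # every candidate up to k_max was rejected


def toy_statistic(k0: int) -> float:
    """Hypothetical placeholder: large while underfitted, small at K0 = 3."""
    return 50.0 / k0 if k0 < 3 else 0.4


estimated_k = estimate_num_communities(toy_statistic, k_max=6)
```

Keeping the statistic as a callable separates the stopping rule from the computation of the statistic, so any implementation of the test can be plugged in without changing the loop.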

Paper Structure

This paper contains 22 sections, 15 theorems, 209 equations, 3 figures, 5 tables, 3 algorithms.

Key Result

Lemma 1

$\widetilde{A}^{\text{ideal}}$ satisfies:

Figures (3)

  • Figure 1: Histogram plots of $T$ for different choices of $(K,n)$, where the red curve is the probability density function of $N(0,1)$.
  • Figure 2: $|T|$ against increasing $K_{0}$ for the real-world networks used in this paper.
  • Figure 3: $\eta_{K_{0}}$ against increasing $K_{0}$ for the real-world networks used in this paper, with the largest $\eta_{K_{0}}$ value highlighted by a larger dot.

Theorems & Definitions (33)

  • Definition 1: Multi-layer stochastic block model (multi-layer SBM)
  • Lemma 1
  • Lemma 2
  • Remark 1
  • Lemma 3
  • Theorem 1: Asymptotic null distribution
  • Theorem 2: Asymptotic power guarantee
  • Theorem 3: Consistency of NAST
  • Theorem 4: Behavior of the ratio statistic
  • Theorem 5: Consistency of SR-NAST
  • ...and 23 more