On inference for modularity statistics in structured networks

Anirban Mitra; Konasale Prasad; Joshua Cape

TL;DR

This article formulates and studies several modularity statistic variants and establishes asymptotic distributional results for them in the large-network limit, for networks exhibiting nodal community structure. The results can be used in conjunction with existing theoretical guarantees for stochastic blockmodel random graphs.

Abstract

This paper revisits the classical concept of network modularity and its spectral relaxations used throughout graph data analysis. We formulate and study several modularity statistic variants for which we establish asymptotic distributional results in the large-network limit for networks exhibiting nodal community structure. Our work facilitates testing for network differences and can be used in conjunction with existing theoretical guarantees for stochastic blockmodel random graphs. Our results are enabled by recent advances in the study of low-rank truncations of large network adjacency matrices. We provide confirmatory simulation studies and real data analysis pertaining to the network neuroscience study of psychosis, specifically schizophrenia. Collectively, this paper contributes to the limited existing literature to date on statistical inference for modularity-based network analysis. Supplemental materials for this article are available online.
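As a point of reference for the "classical concept of network modularity" mentioned above, the following is a minimal sketch of the standard Newman-Girvan modularity for an undirected graph. It is not one of the paper's statistic variants (those are given in Definition 3.1, which is not reproduced here); the function name `newman_girvan_modularity` and the toy graph are illustrative assumptions.

```python
import numpy as np


def newman_girvan_modularity(A, labels):
    """Classical modularity for an undirected graph without self-loops.

    Q = (1 / 2m) * sum_{i,j} (A_ij - k_i * k_j / (2m)) * 1{c_i == c_j},
    where k_i is the degree of node i and 2m is the sum of all degrees.
    This is the textbook definition, not the paper's variants.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)
    two_m = degrees.sum()
    if two_m == 0:
        raise ValueError("graph has no edges")
    # Indicator matrix: True where nodes i and j share a community label.
    same_community = labels[:, None] == labels[None, :]
    # Expected edge weight under the degree-preserving null model.
    null_model = np.outer(degrees, degrees) / two_m
    return float(((A - null_model) * same_community).sum() / two_m)


# Toy graph (hypothetical): two disjoint triangles.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0

q_split = newman_girvan_modularity(A, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = newman_girvan_modularity(A, [0] * 6)            # all nodes in one community
```

Assigning each triangle its own community gives a modularity of 0.5, while placing all nodes in a single community gives 0, which illustrates how the statistic rewards partitions aligned with the edge structure.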

Paper Structure (28 sections, 6 theorems, 29 equations, 14 figures, 4 tables)

Introduction
Background
Random graph models
Basics of network modularity
Modularity-based inference
Modularity variants
Modularity asymptotics
Notational setup
Asymptotic distributions for modularity statistics
Hypothesis testing
Simulations
Example 1: graphs with $K=3$, $d=3$, balanced, assortative
Example 2: graphs with $K=2$, $d=2$, balanced, disassortative
Example 3: graphs with $K=2$, $d=1$, unbalanced, core-periphery
Contour plots and parameter space
...and 13 more sections

Key Result

Theorem 3.1

For $n \ge 1$, let $\mathbf{A}^{(n)} \sim \operatorname{SBM}(\mathbf{B}, \boldsymbol{\pi})$ be a sequence of stochastic blockmodel graphs with sparsity factor $\rho_{n}$ satisfying $n\rho_{n} = \omega(\log n)$. Then, as $n \rightarrow \infty$, the likelihood variant of modularity in (like_mod) satisfies [...] The matrix $\mathbf{D}$ is specified in (D-form) and depends on whether $\rho_{n} \equiv 1$ or $\rho_{\ldots}$ [...]
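To make the setting of the theorem concrete, here is a minimal sketch of sampling one graph from a stochastic blockmodel with a sparsity factor. It assumes the common convention that block labels are drawn i.i.d. from $\boldsymbol{\pi}$, that edges are independent Bernoulli variables with probability $\rho_{n} B_{k\ell}$ for nodes in blocks $k$ and $\ell$, and that the graph is undirected without self-loops. The paper's Definition 2.1 may differ in these details. The function name `sample_sbm` and the numerical values of `B`, `pi`, and `rho` are hypothetical.

```python
import numpy as np


def sample_sbm(n, B, pi, rho=1.0, seed=None):
    """Sample an undirected, loop-free SBM adjacency matrix (assumed convention).

    n   : number of nodes
    B   : K x K symmetric block connectivity matrix with entries in [0, 1]
    pi  : length-K vector of block membership probabilities
    rho : sparsity factor; rho = 1 is the dense regime
    """
    rng = np.random.default_rng(seed)
    B = np.asarray(B, dtype=float)
    pi = np.asarray(pi, dtype=float)
    # Draw block labels i.i.d. from pi (an assumption about the model).
    labels = rng.choice(len(pi), size=n, p=pi)
    # Edge probability matrix: P_ij = rho * B[label_i, label_j].
    P = rho * B[np.ix_(labels, labels)]
    # Sample the strict upper triangle, then symmetrize; the diagonal stays zero.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    A = (upper | upper.T).astype(int)
    return A, labels


# Hypothetical parameters: rho = n^{-1/2} gives n * rho = sqrt(n),
# which grows faster than log(n).
n = 300
A, labels = sample_sbm(n, B=[[0.5, 0.1], [0.1, 0.4]], pi=[0.5, 0.5],
                       rho=n ** -0.5, seed=0)
```

Sampling only the strict upper triangle and mirroring it keeps the adjacency matrix symmetric with a zero diagonal, which matches the undirected, loop-free assumption stated above.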

Figures (14)

Figure 1: Dense networks in Example 1 with $n = 300$ nodes. Left plot shows $\rho_{n}^{-1/2}n^{-1}Q_{\operatorname{L}}$, and right plot shows $\rho_{n}^{-1/2}n^{-1}Q_{\operatorname{S}}$. Dashed vertical line shows bias in simulation. Solid vertical line shows population bias. Solid curve shows population density fit.
Figure 2: Sparse networks in Example 1 for $n \in \{300,600,1800,6000\}$ nodes. Left panel shows $\rho_{n}^{-1/2}n^{-1}Q_{\operatorname{L}}$, and right panel shows $\rho_{n}^{-1/2}n^{-1}Q_{\operatorname{S}}$. Dashed vertical line shows bias in simulation. Solid vertical line shows population bias. Solid curve shows population density fit.
Figure 3: Dense networks in Example 2 with $n \in \{400,800,1000,4000\}$ nodes. Left panel shows $\rho_{n}^{-1/2}n^{-1}Q_{\operatorname{L}}$, and right panel shows $\rho_{n}^{-1/2}n^{-1}Q_{\operatorname{S}}$. Dashed vertical line shows bias in simulation. Solid vertical line shows population bias. Solid curve shows population density fit.
Figure 4: Sparse networks and residual-based modularity in Example 3. Dashed vertical line shows bias in simulation. Solid vertical line shows population bias. Solid curve shows population density fit.
Figure 5: Asymptotic bias and variance plotted as functions of $(p,q)$ where $\mathbf{B} = \begin{bmatrix} p^{2} & pq \\ pq & q^{2} \end{bmatrix}$ and $\boldsymbol{\pi} = [1/4, 3/4]^{\top}$.
...and 9 more figures
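The $2 \times 2$ block matrix in the Figure 5 caption, with rows $(p^{2}, pq)$ and $(pq, q^{2})$, has a simple structure worth spelling out. The short derivation below shows that it is an outer product and therefore has rank one whenever $(p,q) \neq (0,0)$. Connecting this to the $d=1$, unbalanced setting of Example 3 is an inference from the section titles and the unbalanced $\boldsymbol{\pi}$, not something stated in the caption.

```latex
\mathbf{B}
= \begin{bmatrix} p^{2} & pq \\ pq & q^{2} \end{bmatrix}
= \begin{bmatrix} p \\ q \end{bmatrix}
  \begin{bmatrix} p & q \end{bmatrix},
\qquad
\det \mathbf{B} = p^{2} q^{2} - (pq)^{2} = 0 .
```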

Theorems & Definitions (10)

Definition 2.1: Stochastic blockmodel random graphs
Definition 3.1: Likelihood-based, Spectral-based and Residual-based modularities
Theorem 3.1: Limiting distribution for likelihood-based modularity
Theorem 3.2: Limiting distribution for spectral-based modularity
Theorem 3.3: Limiting distribution for residual-based modularity
Definition 3.2: Maximum likelihood estimator and spectral estimator
Lemma A.1: bickel2013asymptotic
Lemma A.2: tang2022asymptotically
Lemma A.3: tang2022asymptotically
Proof: Proofs of Theorems 3.1, 3.2, and 3.3
