Rates of convergence and normal approximations for estimators of local dependence random graph models

Jonathan R. Stewart

TL;DR

These theoretical results are the first to achieve both optimal rates of convergence and non-asymptotic bounds on the error of the multivariate normal approximation for parameter vectors of local dependence random graph models.

Abstract

Local dependence random graph models are a class of block models for network data which allow for dependence among edges under a local dependence assumption defined around the block structure of the network. Since their introduction by Schweinberger and Handcock (2015), research in the statistical network analysis and network science literatures has demonstrated the potential and utility of this class of models. In this work, we provide the first theory for estimation and inference which ensures consistent and valid inference of parameter vectors of local dependence random graph models. This is accomplished by deriving convergence rates of estimation and inference procedures for local dependence random graph models based on a single observation of the graph, allowing both the number of model parameters and the sizes of blocks to tend to infinity. First, we derive non-asymptotic bounds on the $\ell_2$-error of maximum likelihood estimators with convergence rates, outlining conditions under which these rates are minimax optimal. Second, and more importantly, we derive non-asymptotic bounds on the error of the multivariate normal approximation. These theoretical results are the first to achieve both optimal rates of convergence and non-asymptotic bounds on the error of the multivariate normal approximation for parameter vectors of local dependence random graph models.

Paper Structure (27 sections, 19 theorems, 301 equations, 3 figures)

Introduction
Local dependence random graph models
Examples of exponential-family local dependence random graph models
Example 1: The stochastic block model
Example 2: Transitivity in local dependence random graphs
Example 3: Incorporating node and block heterogeneity into models
Theoretical guarantees
Preliminaries for exponential families
Convergence rates of maximum likelihood estimators
Upper bounds on the $\ell_2$-error of maximum likelihood estimators
Minimax risk in the $\ell_2$-norm and optimal rates of convergence
Convergence rates of the multivariate normal approximation
Simulation results
Simulation study 1: Convergence rates of maximum likelihood estimators
Simulation study 2: Error of the normal approximation
...and 12 more sections

Key Result

Theorem 2.1

Consider a minimal exponential-family local dependence random graph model satisfying Assumptions a1, a2, a3, and a4 and assume that $p = \dim(\boldsymbol{\theta}^\star_W) \geq \log \, N$ and $q = \dim(\boldsymbol{\theta}^\star_B) \geq \log \, N$. Then there exist constants $C > 0$ and $N_0 \geq 3$, for all integers $N \geq N_0$.

Figures (3)

Figure 1: Three real data examples of networks for which local dependence random graph models would be applicable, including Sampson's monastery network, the school classes data set from Stewart et al. (2019), and the Bali terrorist network studied in Schweinberger and Handcock (2015). Node colors correspond to block memberships.
Figure 2: The results of Simulation study 1, which demonstrates the trade-off in finite sample performance of maximum likelihood estimators based on the number of model parameters and size of the network. Each boxplot for each combination of case and network size is based on $500$ replications. Boxplots display the empirical distribution of the $\ell_2$-error, whereas the red lines track the $95\%$ sample quantiles and the blue dashed lines track the error bounds predicted by Theorem \ref{thm:L2}.
Figure 3: Quantile-quantile plots showing the results of Simulation study 2. The sample quantiles of the standardized maximum likelihood estimates of the transitive edge parameter are plotted against the theoretical quantiles based on the standard normal approximation in each of the two cases studied in Simulation study 2 across networks of size $N \in \{250, 500, 750, 1000\}$.
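
The quantile-quantile comparison described for Figure 3 can be sketched in a few lines. This is only an illustrative sketch, not the paper's code: the function name `qq_points`, the standardization by sample mean and standard deviation, the plotting positions $(i - 0.5)/n$, and the synthetic input are all assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, stdev


def qq_points(estimates):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    n = len(estimates)
    center, scale = mean(estimates), stdev(estimates)
    # Standardize with the sample mean and standard deviation.
    # Assumption: the paper may standardize its estimates differently.
    sample = sorted((x - center) / scale for x in estimates)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Plotting positions (i - 0.5) / n avoid infinite quantiles at 0 and 1.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sample))


rng = random.Random(0)
# Hypothetical stand-in for replicated estimates; NOT data from the paper.
fake_estimates = [rng.gauss(0.5, 0.1) for _ in range(500)]
points = qq_points(fake_estimates)
```

Points lying close to the line $y = x$ suggest that the standardized estimates are approximately standard normal; systematic curvature in the tails suggests that the normal approximation is poor at that sample size.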

Theorems & Definitions (23)

Remark 1: Discussion of Assumption \ref{a1}
Remark 2: Discussion of Assumption \ref{a2}
Remark 3: Discussion of Assumption \ref{a3}
Remark 4: Discussion of Assumption \ref{a4}
Theorem 2.1
Theorem 2.2
Theorem 2.3
Corollary 2.4
Theorem 2.5
Corollary 2.6
...and 13 more
