On the Asymptotic Convergence of Subgraph Generated Models
Xinchen Xu, Francesca Parise
TL;DR
The paper addresses the problem of inferring features of large networks when only the generating random graph model, a subgraph generated model (SUGM), is available, rather than full network data. It introduces two variants, the weighted SUGM (wSUGM) and the unweighted SUGM (uSUGM), and proves that the realized adjacency matrix concentrates around its expectation with high probability as the network grows, using matrix concentration inequalities for the weighted case and matrix Efron–Stein inequalities for the unweighted case. It then extends these concentration results to graph centrality measures (degree, eigenvector, and Katz centrality), showing that normalized centralities converge in probability to their counterparts in the expected network under mild conditions on subgraph-generation probabilities. This provides a practical pathway to predict node importance and other network features from the generating process itself, which is useful in large-scale or privacy-constrained settings where exact network data are unavailable.
Abstract
We study a family of random graph models, termed subgraph generated models (SUGMs) and initially developed by Chandrasekhar and Jackson, in which higher-order structures are explicitly included in the network formation process. We use matrix concentration inequalities to show convergence of the adjacency matrix of networks realized from such SUGMs to the expected adjacency matrix as a function of the network size. We apply this result to study the concentration of centrality measures (such as degree, eigenvector, and Katz centrality) in sampled networks around the corresponding centralities in the expected network, thus proving that node importance can be predicted from knowledge of the random graph model without the need for exact network data.
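
The concentration claim can be illustrated numerically with a minimal sketch. The construction below is a toy assumption, not the paper's exact model: subgraphs are restricted to links and triangles, each pair and each triple forms independently with fixed hypothetical probabilities `p_link` and `p_tri`, and the weighted adjacency entry counts how many sampled subgraphs contain that edge. The function names are illustrative only.

```python
from itertools import combinations

import numpy as np


def sample_wsugm(n, p_link, p_tri, rng):
    """Sample a weighted adjacency matrix from a toy links-and-triangles SUGM.

    Assumption: every pair forms a link with probability p_link and every
    triple forms a triangle with probability p_tri, all independently.
    A[i, j] counts the sampled subgraphs that contain the edge {i, j}.
    """
    A = np.zeros((n, n))
    pairs = np.array(list(combinations(range(n), 2)))
    triples = np.array(list(combinations(range(n), 3)))
    links = pairs[rng.random(len(pairs)) < p_link]
    tris = triples[rng.random(len(triples)) < p_tri]
    # Fill the upper triangle only; np.add.at accumulates repeated indices.
    np.add.at(A, (links[:, 0], links[:, 1]), 1.0)
    for a, b in ((0, 1), (0, 2), (1, 2)):
        np.add.at(A, (tris[:, a], tris[:, b]), 1.0)
    return A + A.T  # symmetrize; the diagonal stays zero


def expected_wsugm(n, p_link, p_tri):
    """Expected weighted adjacency: each edge lies in 1 link and n - 2 triples."""
    c = p_link + (n - 2) * p_tri
    return c * (np.ones((n, n)) - np.eye(n))


def relative_spectral_error(n, p_link=0.1, p_tri=0.005, seed=0):
    """Spectral-norm distance between a sample and its mean, relative to the mean."""
    rng = np.random.default_rng(seed)
    A = sample_wsugm(n, p_link, p_tri, rng)
    EA = expected_wsugm(n, p_link, p_tri)
    return np.linalg.norm(A - EA, 2) / np.linalg.norm(EA, 2)


if __name__ == "__main__":
    for n in (30, 60, 120):
        print(n, relative_spectral_error(n))
```

Under these assumptions the relative error should shrink as `n` grows, which is the qualitative behavior the abstract describes. Enumerating all triples costs O(n^3) memory and time, so this sketch is only practical for small networks.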
