Geodesic Length Distribution in Sparse Network Ensembles
Sahil Loomba, Nick S. Jones
TL;DR
This paper derives an analytic geodesic length distribution (GLD) for node pairs in sparse networks by establishing a recursive, probabilistic framework on sparse ensemble average networks (SEANs) and generalizing it to sparse general random networks (SGRNs). It defines survival and conditional-PMF matrices, derives closed-form and approximate closed-form GLD expressions, and connects them to an integral-operator framework that yields spectral forms when the kernel is symmetric. The contributions are (i) a rigorous treatment of the supercritical and subcritical regimes using percolation probabilities, (ii) a unifying closed-form GLD via matrix or operator exponentials and, for symmetric kernels, eigendecompositions, and (iii) detailed model-specific instantiations for stochastic block models (SBM), random dot product graphs (RDPG), Gaussian random geometric graphs (RGG), and sparse graphons, with illustrative insights and empirical validation. The results give analytic access to distances, centralities, and connectedness properties in very large, sparse networks, and offer practical paths for inference and graph learning on partially observed data, with potential applications to network inference, coarsening, and representation learning.
Abstract
A key task in the study of networked systems is to derive local and global properties that impact connectivity, synchronizability, and robustness; computing shortest paths or geodesics yields measures of network connectivity that can explain such phenomena. We derive an analytic distribution of geodesic lengths on the giant component in the supercritical regime -- when the giant component exists -- or on small components in the subcritical regime, of any sparse (and possibly directed) network with conditionally independent edges, in the infinite-size limit. We provide specific results for widely used network models like stochastic block models, dot product graphs, random geometric graphs, and sparse graphons. The survival function of the geodesic length distribution possesses a simple closed-form expression which is asymptotically tight for finite lengths, has a natural interpretation of traversing independent geodesics in the network, and delivers novel insight into the aforementioned network families.
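As a point of comparison for the analytic survival function described above, the following is a minimal sketch of how an empirical geodesic length distribution could be estimated by simulation. It is not the paper's method: it samples one sparse two-block stochastic block model and runs breadth-first search from every node. The function names (`sample_sparse_sbm`, `bfs_lengths`, `empirical_survival`), the block matrix, and the choice to condition on all reachable ordered pairs (rather than on the giant component specifically) are assumptions made for illustration.

```python
import random
from collections import Counter, deque


def sample_sparse_sbm(n, block_means, seed=0):
    """Sample an undirected SBM whose edge probability is block_means[a][b] / n.

    Scaling by 1/n keeps the expected degree bounded as n grows (sparse regime).
    Nodes are assigned to blocks round-robin; this is an illustrative choice.
    """
    rng = random.Random(seed)
    k = len(block_means)
    blocks = [i % k for i in range(n)]
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = block_means[blocks[i]][blocks[j]] / n
            if rng.random() < p:  # edges are independent given block labels
                adj[i].append(j)
                adj[j].append(i)
    return adj


def bfs_lengths(adj, source):
    """Return geodesic lengths from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def empirical_survival(adj):
    """Estimate P(geodesic length > l) over ordered pairs of distinct, connected nodes."""
    counts = Counter()
    for s in range(len(adj)):
        for t, d in bfs_lengths(adj, s).items():
            if t != s:
                counts[d] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    survival = {}
    remaining = total
    for l in range(max(counts) + 1):
        remaining -= counts.get(l, 0)  # pairs with length exactly l no longer survive
        survival[l] = remaining / total
    return survival


if __name__ == "__main__":
    # Hypothetical assortative two-block model with mean degree around 2.5.
    adj = sample_sparse_sbm(300, [[4.0, 1.0], [1.0, 4.0]], seed=1)
    for l, s in empirical_survival(adj).items():
        print(l, round(s, 4))
```

In the supercritical regime most connected pairs lie in the giant component, so this estimate should be close to one conditioned on the giant component, but small components do contribute; restricting the sources and targets to the largest component would match the conditioning in the abstract more closely.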
