Clique and cycle frequencies in a sparse random graph model with overlapping communities

Tommi Gröhn, Joona Karjalainen, Lasse Leskelä

TL;DR

The paper quantifies small-subgraph structure in a sparse random graph model with overlapping communities. The graph is built as a superposition of independent layers, with each layer's size and strength drawn from a joint distribution $\pi_n$. The authors develop a general framework based on second moments to approximate subgraph frequencies and prove that they concentrate. They show that the expected count of a connected subgraph $R$ with $r$ nodes and $s$ edges scales with the cross-moment $(\pi_n)_{r,s}$, and that clique and cycle counts concentrate around their means under verifiable moment conditions. For cliques $K_r$ and cycles $C_r$, the paper derives the explicit asymptotics $\mathbb{E}N_{K_r}(G_n)=(1+O(n^{-1}))(r!)^{-1}m_n(\pi_n)_{r,\binom{r}{2}}$ and $\mathbb{E}N_{C_r}(G_n)=(1+O(n^{-1}))(2r)^{-1}m_n(\pi_n)_{r,r}$, and shows that the empirical frequencies converge to these expectations with high probability under the corresponding assumptions. The results allow the theoretical densities to be estimated reliably from data, show how sparse graphs with overlapping communities differ from Erdős–Rényi baselines, and suggest parameter estimation by the method of moments.
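The leading-order formulas above can be evaluated numerically for a discrete layer type distribution. The sketch below is illustrative only and rests on assumptions not stated in this summary: that $m_n$ is the number of layers, and that the cross-moment $(\pi)_{r,s}$ means $\sum_{(x,y)} \pi(x,y)\,(x)_r\,y^s$ with $(x)_r$ the falling factorial. The function names and the `(size, prob, weight)` triple format are hypothetical, and the $(1+O(n^{-1}))$ correction factor is dropped.

```python
from math import comb, factorial


def falling_factorial(x, r):
    """Return (x)_r = x (x-1) ... (x-r+1)."""
    out = 1
    for i in range(r):
        out *= x - i
    return out


def cross_moment(layer_types, r, s):
    """Assumed cross-moment: sum of weight * (size)_r * prob**s.

    layer_types is a list of (size, prob, weight) triples whose
    weights sum to one (a discrete stand-in for pi_n).
    """
    return sum(w * falling_factorial(x, r) * y**s for x, y, w in layer_types)


def expected_cliques(m, layer_types, r):
    """Leading-order E N_{K_r}: m * (pi)_{r, C(r,2)} / r!."""
    return m * cross_moment(layer_types, r, comb(r, 2)) / factorial(r)


def expected_cycles(m, layer_types, r):
    """Leading-order E N_{C_r}: m * (pi)_{r, r} / (2r)."""
    return m * cross_moment(layer_types, r, r) / (2 * r)
```

For example, `expected_cliques(10, [(4, 0.5, 1.0)], 3)` evaluates the triangle formula for ten layers of size 4 with edge probability 0.5. For $r = 3$ the clique and cycle formulas coincide, as they should, since $K_3 = C_3$.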

Abstract

A statistical network model with overlapping communities can be generated as a superposition of mutually independent random graphs of varying size. The model is parameterized by the number of nodes, the number of communities, and the joint distribution of the community size and the edge probability. This model admits sparse parameter regimes with power-law limiting degree distributions and non-vanishing clustering coefficients. This article presents large-scale approximations of clique and cycle frequencies for graph samples generated by the model, which are valid for regimes with unbounded numbers of overlapping communities. Our results reveal the growth rates of these subgraph frequencies and show that their theoretical densities can be reliably estimated from data.
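The construction described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: each layer's node set is assumed to be a uniformly random subset of the nodes, edges inside a layer are assumed to appear independently, and `sample_layer_type` is a hypothetical callback that returns one `(size, edge_probability)` pair drawn from the layer type distribution.

```python
import random
from itertools import combinations


def sample_overlay_graph(n, m, sample_layer_type, rng):
    """Return the edge set of a union of m independent random layers on n nodes.

    Assumption: a layer of type (size, prob) picks `size` nodes uniformly
    at random and links each pair of them independently with probability prob.
    """
    edges = set()
    for _ in range(m):
        size, prob = sample_layer_type(rng)
        members = rng.sample(range(n), min(size, n))
        for u, v in combinations(sorted(members), 2):
            if rng.random() < prob:
                edges.add((u, v))  # stored with u < v; overlaps merge
    return edges


def count_triangles(n, edges):
    """Count triangles, each exactly once, from an edge set with u < v."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for u, v in edges:
        # Count a triangle only at its edge joining the two smallest nodes.
        total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total
```

A call such as `sample_overlay_graph(1000, 300, lambda rng: (rng.choice([3, 5, 8]), 0.6), random.Random(1))` produces one graph sample; the size values and edge probability in that call are arbitrary illustrative choices. Triangle counting stands in here for the general clique and cycle counts studied in the article.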

Paper Structure

This paper contains 14 sections, 14 theorems, 119 equations.

Key Result

Theorem 4.1

Let $R$ be a connected graph with $r$ nodes and $s$ edges, and assume that the layer type distribution satisfies […]. Then the expected number of $R$-isomorphic subgraphs in the model satisfies $\mathbb{E} N_R(G_n) \asymp m_n (\pi_n)_{r,s} \wedge n^r$.

Theorems & Definitions (33)

  • Theorem 4.1
  • Remark
  • Remark
  • Theorem 4.2
  • Remark
  • Theorem 4.4
  • Remark
  • Theorem 4.6
  • Lemma 5.1
  • proof
  • ...and 23 more