Modeling diffusion in networks with communities: a multitype branching process approach

Alina Dubovskaya; Caroline B. Pena; David J. P. O'Sullivan

TL;DR

Diffusion on networks with community structure is analyzed via a multitype branching process combined with probability-generating functions to obtain distributional properties of cascades. The method handles a simple contagion under the Independent Cascade Model and yields pgf-based expressions for extinction probabilities $q(t)$, hazard functions $h(t)$, and cascade-size distributions for each community and for the whole network, as well as cross-community introduction probabilities. It extends from the Poisson SBM to general locally tree-like networks using excess-degree distributions, is demonstrated on SBM and log-normal networks, and reveals how the initial seeding location affects cascade sizes. The results provide a practical tool for predicting outbreak and diffusion behavior from limited degree-distribution information and offer pathways for future work on directed networks, multiple communities, and data-driven validation.

Abstract

The dynamics of diffusion in complex networks are widely studied to understand how entities, such as information, diseases, or behaviors, spread in an interconnected environment. Complex networks often present community structure, and tools to analyze diffusion processes on networks with communities are needed. In this paper, we develop theoretical tools using multi-type branching processes to model and analyze diffusion processes, following a simple contagion mechanism, across a broad class of networks with community structure. We show how, by using limited information about the network -- the degree distribution within and between communities -- we can calculate standard statistical characteristics of propagation dynamics, such as the extinction probability, hazard function, and cascade size distribution. These properties can be estimated not only for the entire network but also for each community separately. Furthermore, we estimate the probability of spread crossing from one community to another where it is not currently spreading. We demonstrate the accuracy of our framework by applying it to two specific examples: the Stochastic Block Model and a log-normal network with community structure. We show how the initial seeding location affects the observed cascade size distribution on a heavy-tailed network and that our framework accurately captures this effect.
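To make the extinction probability and hazard function concrete, here is a minimal sketch of the pgf iteration for a two-community Poisson network. It is not the authors' code. It assumes that an active node in either community activates a Poisson number of neighbors with mean $\rho\lambda_{\text{in}}$ in its own community and $\rho\lambda_{\text{out}}$ in the other one, and that the hazard is the probability of extinction at generation $t$ given survival to generation $t-1$. The function names are hypothetical.

```python
import math


def extinction_probabilities(lam_in, lam_out, rho, t_max):
    """Iterate q_i(t) = G_i(q_1(t-1), q_2(t-1)) for a two-type Poisson process.

    Assumption: a type-i node has Poisson(rho * lam_in) offspring of its own
    type and Poisson(rho * lam_out) offspring of the other type, so
    G_1(s1, s2) = exp(rho * (lam_in * (s1 - 1) + lam_out * (s2 - 1))).
    Returns a list of (q_1(t), q_2(t)) for t = 0..t_max, where q_i(t) is the
    probability of extinction by generation t given one seed in community i.
    """
    q = [(0.0, 0.0)]  # a single seed is alive at generation 0
    for _ in range(t_max):
        q1, q2 = q[-1]
        new_q1 = math.exp(rho * (lam_in * (q1 - 1.0) + lam_out * (q2 - 1.0)))
        new_q2 = math.exp(rho * (lam_out * (q1 - 1.0) + lam_in * (q2 - 1.0)))
        q.append((new_q1, new_q2))
    return q


def hazard(q_seq):
    """h(t) = (q(t) - q(t-1)) / (1 - q(t-1)) for t = 1..len(q_seq) - 1."""
    h = []
    for t in range(1, len(q_seq)):
        survival = 1.0 - q_seq[t - 1]
        # Guard against division by zero once extinction is numerically certain.
        h.append((q_seq[t] - q_seq[t - 1]) / survival if survival > 0 else float("nan"))
    return h


if __name__ == "__main__":
    # Parameter values taken from the Figure 2 caption.
    q = extinction_probabilities(lam_in=8, lam_out=2, rho=0.06, t_max=10)
    q_seed_1 = [q1 for q1, _ in q]
    for t, h_t in enumerate(hazard(q_seed_1), start=1):
        print(f"t={t}  q(t)={q_seed_1[t]:.4f}  h(t)={h_t:.4f}")
```

With these parameters the mean number of offspring per active node is $\rho(\lambda_{\text{in}}+\lambda_{\text{out}}) = 0.6 < 1$, so the process is subcritical and $q(t)$ approaches 1 as $t$ grows.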

Paper Structure (17 sections, 66 equations, 12 figures, 2 tables)

Introduction and Background
Conceptual model description of the spreading process on networks with communities
Probability-generating function framework in a Poisson-distributed network
Extinction probability and hazard function
Community specific hazard function
Probability of contagion travel between communities
Cascade size distribution
Extending the pgf framework to a general network with communities
Application of pgf framework to a log-normal network with community structure
The community seeding effect on cascade sizes
Conclusion
Numerical estimation of probabilities from probability generating functions
Recovering probabilities from probability generating functions
Comparing branching process simulations to network simulations
Log-normal network parameter sweep
...and 2 more sections

Figures (12)

Figure 1: a) Schematic illustration of the Independent Cascade Model (ICM) on a community-based network. The network consists of two communities, each described by its internal degree distribution and the distribution of edges between communities. Here $D_1^{(1)}$ and $D_2^{(2)}$ are random variables for a node's degree inside communities 1 and 2, respectively; $D_2^{(1)}$ and $D_1^{(2)}$ are random variables for a node's between-community degree (they are the same in the case of the undirected networks considered here). At each time step, active nodes attempt to activate neighbors with probability $\rho$, then become inactive (or "removed"). The spread starts in the first community and later propagates to the second community. b) Schematic of the multi-type branching process approximating the ICM on a community-based network. We track two types of offspring: "type 1" for active nodes in community 1, and "type 2" for active nodes in community 2. Here $\boldsymbol{N}(t)=(N_1(t),N_2(t))$ tracks the number of active nodes of type 1 ($N_1(t)$) and type 2 ($N_2(t)$) at each time step.
Figure 2: a) Extinction probability for Stochastic Block Model; b) Hazard function for Stochastic Block Model. Here, $\lambda_{\text{in}}=8, \; \lambda_{\text{out}}=2$ and $\rho=0.06$. For numerical simulations, we use branching process simulations with averaging across $5\times 10^{4}$ simulations. The hazard function starts in generation 1 (not 0) as the probability of survival until generation 0 is not defined.
Figure 3: a) Probability of infection spreading to community 2 and the probability of reintroduction back to community 1 for SBM. For numerical simulations, we use branching process simulations where we average across $5\times 10^{5}$ simulations. Here, $\lambda_{\text{in}}=8, \; \lambda_{\text{out}}=2$ and $\rho=0.08$; b) Cascade size distribution for SBM. Here, $\lambda_{\text{in}}=8, \; \lambda_{\text{out}}=2$ and $\rho=0.06$.
Figure 4: Schematic illustrating the four types of offspring tracked in the model: a) $s_1^{in}$ represents offspring in community 1 produced by traversing an edge within community 1. It produces $\widetilde{X}^{(1)}_1$ and $X^{(1)}_2$ offspring in the next time step; b) $s_1^{out}$ is offspring in community 1 produced by traversing an edge from community 2, producing $X^{(1)}_1$ and $\widetilde{X}^{(1)}_2$ offspring in the next time step; c) $s_2^{in}$ is offspring in community 2 produced by traversing an edge within community 2. It creates $X^{(2)}_1$ and $\widetilde{X}^{(2)}_2$ offspring in the next time step; d) $s_2^{out}$ is offspring in community 2 produced by traversing an edge from community 1. It creates $\widetilde{X}^{(2)}_1$ and $X^{(2)}_2$ offspring in the next time step. The red arrow represents the edge through which the node was infected. This edge cannot be used again, so we use the excess degree distribution for the community containing the traversed edge. Black, blue and yellow arrows show the ways that infection can proceed in the next generation.
Figure 5: a) Degree distribution for our example log-normal network. Nodes in community 1 have an average degree of $\approx 3$, nodes in community 2 have an average degree of $\approx 7$, and the average degree for a randomly selected node on the network is $\approx 5$. These degree distributions also include the cross-community edges; b) Degree distribution for our example log-normal network with the x-axis on a log scale; c) Extinction probability for the log-normal network. Here, $\rho = 0.16$.
...and 7 more figures
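
The captions of Figures 2 and 3 refer to branching process simulations averaged over many runs. The sketch below shows one way such a simulation could look for the Poisson SBM case; it is not the authors' implementation. It assumes each active node produces Poisson($\rho\lambda_{\text{in}}$) offspring in its own community and Poisson($\rho\lambda_{\text{out}}$) in the other one, seeds a single node in community 1, and uses hypothetical function names. The run count is reduced from the captions' values to keep the example quick.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (small means only)."""
    threshold = math.exp(-mean)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def extinction_generation(lam_in, lam_out, rho, rng, t_max):
    """Return the first generation with no active nodes, or None if still alive at t_max."""
    n1, n2 = 1, 0  # assumption: one seed in community 1
    for t in range(1, t_max + 1):
        # Sums of independent Poisson variates are Poisson, so the offspring of
        # all active nodes can be drawn in one step per target community.
        new_n1 = sample_poisson(rho * (lam_in * n1 + lam_out * n2), rng)
        new_n2 = sample_poisson(rho * (lam_out * n1 + lam_in * n2), rng)
        n1, n2 = new_n1, new_n2
        if n1 + n2 == 0:
            return t
    return None


def empirical_extinction(lam_in, lam_out, rho, runs, t_max, seed=0):
    """Estimate q(t), the fraction of runs extinct by generation t, for t = 0..t_max."""
    rng = random.Random(seed)
    counts = [0] * (t_max + 1)
    for _ in range(runs):
        t_ext = extinction_generation(lam_in, lam_out, rho, rng, t_max)
        if t_ext is not None:
            counts[t_ext] += 1
    q, extinct_so_far = [], 0
    for c in counts:
        extinct_so_far += c
        q.append(extinct_so_far / runs)
    return q


if __name__ == "__main__":
    # Parameter values taken from the Figure 2 caption.
    q_hat = empirical_extinction(lam_in=8, lam_out=2, rho=0.06, runs=20000, t_max=10)
    for t, value in enumerate(q_hat):
        print(f"t={t}  estimated q(t)={value:.4f}")
```

Under these assumptions the estimate at generation 1 should be close to $e^{-\rho(\lambda_{\text{in}}+\lambda_{\text{out}})} = e^{-0.6}$, the probability that the seed produces no offspring at all.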
