Attribute network models, stochastic approximation, and network sampling and ranking algorithms

Nelson Antunes; Sayan Banerjee; Shankar Bhamidi; Vladas Pipiras

TL;DR

This work analyzes attributed evolving networks in which preferential attachment is modulated by node attributes through a kernel κ, and establishes local weak convergence to randomly stopped multi-type branching processes that describe the limiting local geometry. The authors develop a resolvability principle that transfers asymptotics from a tractable U model to the original P model, and derive asymptotics for both local functionals (degree tails, PageRank) and global functionals (maximal degree). They extend the results to non-tree and uniform attachment regimes and apply the theory to network sampling, showing how PageRank and walk-based sampling can mitigate minority bias when an attribute is rare. The results yield explicit PageRank tail exponents that do not depend on type, type-dependent degree tail exponents, and detailed sampling bias formulas, with implications for fairness and inference in large-scale attributed networks.

Abstract

We analyze dynamic random network models where younger vertices connect to older ones with probabilities proportional to their degrees as well as a propensity kernel governed by their attribute types. Using stochastic approximation techniques, we show that, in the large network limit, such networks converge in the local weak sense to limiting infinite random trees with an explicit description in terms of randomly stopped multi-type branching processes. This allows for the derivation of asymptotics for a wide class of network functionals, implying, for example, that while degree distribution tail exponents depend on the attribute type (already derived by Jordan (2013)), PageRank centrality scores have the same tail exponent across attributes. The limit results also give explicit formulae for the performance of various network sampling mechanisms. One surprising consequence is the efficacy of PageRank and walk-based network sampling schemes for directed networks in the setting of rare minorities.
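To make the attachment mechanism in the abstract concrete, here is a minimal simulation sketch of a tree-valued version of the model. It is not the authors' construction: the function name `simulate_attributed_pa_tree`, the weight convention (one plus the number of children), the argument order of the kernel, and the example values of `type_probs` and `kappa` are all assumptions made for illustration.

```python
import random


def simulate_attributed_pa_tree(n, type_probs, kappa, seed=0):
    """Grow a random tree on n vertices with attribute-modulated attachment.

    Assumptions (not taken from the paper):
      * each new vertex draws its type i.i.d. from `type_probs`;
      * an existing vertex v is chosen as parent with probability
        proportional to kappa[type(v)][type(new)] * weight[v];
      * weight[v] = 1 + (number of children of v).
    """
    rng = random.Random(seed)
    type_ids = list(range(len(type_probs)))

    types = [rng.choices(type_ids, weights=type_probs)[0]]
    parent = [None]   # the root has no parent
    weight = [1]      # 1 + number of children

    for new in range(1, n):
        a_new = rng.choices(type_ids, weights=type_probs)[0]
        # Attachment weights over all existing vertices (O(n) per step).
        attach = [kappa[types[v]][a_new] * weight[v] for v in range(new)]
        target = rng.choices(range(new), weights=attach)[0]

        types.append(a_new)
        parent.append(target)
        weight.append(1)
        weight[target] += 1

    return types, parent, weight


# Hypothetical two-type example: a 20% minority and a homophilous kernel.
types, parent, weight = simulate_attributed_pa_tree(
    n=200,
    type_probs=[0.8, 0.2],
    kappa=[[1.0, 0.5], [0.5, 1.0]],
    seed=1,
)
max_weight_by_type = {
    t: max(w for w, s in zip(weight, types) if s == t) for t in set(types)
}
```

Each step costs time linear in the current tree size, so this sketch is only suitable for small `n`. The kernel entries must be strictly positive here so that the attachment weights never sum to zero.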

Paper Structure (40 sections, 32 theorems, 162 equations, 3 figures)

Introduction
Motivation for this work and analysis
Organization of the paper
Constructions and basic definitions
The $\boldsymbol{\mathscr{U}}$ Model
Local weak convergence for trees
Extended fringe decomposition for marked trees
Convergence on the space of trees
Local weak convergence for directed graphs
Functionals of interest
Main Results: Tree networks
Resolvability
Local weak convergence for finite attribute models
Asymptotics for degree distribution, PageRank scores and homophily measures
Asymptotics for global functionals
...and 25 more sections

Key Result

Lemma 2.2

Assume $\mathcal{S}$ is compact and $\kappa(\cdot, \cdot)$ is bounded. Then the above branching does not explode, i.e., $T_n\stackrel{\mathrm{a.s.}}{\longrightarrow} \infty$ as $n\to\infty$. Let $\left\{\mathcal{G}_n:n\geq 0\right\} \equiv \left\{\mathop{\mathrm{BP}}\nolimits(T_n):n\geq 0\right\}$ den

Figures (3)

Figure 2.1: Fringe decomposition around vertex $v$ of a finite tree rooted at $\rho$. Here the blue vertices are the roots of the respective trees.
Figure 2.2: A sin-tree $\mathcal{T}_\infty$, namely a tree rooted at $0$ with a single infinite path to infinity, and the corresponding extended fringe $F_3(0,\mathcal{T}_\infty)$ up to level three about $0$.
Figure 2.3:

Theorems & Definitions (63)

Definition 1.1: Attributed evolving network model class $\boldsymbol{\mathscr{P}}$
Definition 2.1: Network model class $\boldsymbol{\mathscr{U}}$
Lemma 2.2
Definition 2.3: Local weak convergence
Definition 2.4: Fringe distribution (Aldous)
Lemma 2.5
Definition 2.6: Local weak convergence for directed graphs
Definition 2.7: PageRank scores with damping factor $c$
Definition 3.1: Resolvability
Lemma 3.3 (Jordan, 2013)
...and 53 more
