Explosive Growth in Large-Scale Collaboration Networks

Peter Williams, Zhan Chen

TL;DR

Two long-span collaboration networks, MAG (1800–2020) and IMDb (1900–2020), show persistent super-linear growth, $N(t) \propto t^{\alpha}$ (MAG: $\alpha_1 = 2.3$, rising to $\alpha_2 = 3.1$ after 1950; IMDb: $\alpha \approx 1.8$). Node and edge processes are tightly coupled: the timescale ratio $\tau_N/\tau_E$ stays in the range $2.3$–$2.8$ despite rapid expansion. Waiting times between collaborations follow scale-free laws whose exponents $\gamma$ evolve over time (MAG: $2.3 \to 1.6$; IMDb: $2.6 \to 2.1$). Waiting-time patterns persist across centuries, while collaboration sizes diverge by domain (academic papers grow from 1.2 to 5.8 authors per paper; entertainment stays near 3.2–4.5 cast members). External historical events influence node-entry dynamics more strongly than edge formation. This environmental coupling challenges assumptions of timescale separation and closed-system growth, and motivates new, empirically grounded theories of long-term network evolution.

Abstract

We analyse the evolution of two large collaboration networks: the Microsoft Academic Graph (MAG, 1800–2020) and the Internet Movie Database (IMDb, 1900–2020), comprising $2.72 \times 10^8$ and $1.88 \times 10^6$ nodes respectively. The networks show super-linear growth, with node counts following power laws $N(t) \propto t^{\alpha}$, where $\alpha = 2.3$, increasing to $3.1$ after 1950 (MAG), and $\alpha = 1.8$ (IMDb). Node and edge processes maintain stable but noisy timescale ratios ($\tau_N/\tau_E \approx 2.8 \pm 0.3$ for MAG, $2.3 \pm 0.2$ for IMDb). The distribution of waiting times $t$ between successive collaborations is scale-free, $P(t) \propto t^{-\gamma}$, with indices evolving from $\gamma \approx 2.3$ to $1.6$ (MAG) and from $2.6$ to $2.1$ (IMDb). Academic collaboration sizes increased from $1.2$ to $5.8$ authors per paper, while entertainment collaborations remained more stable ($3.2$ to $4.5$ actors). These observations indicate that current network models might be enhanced by considering accelerating growth, coupled timescales, and environmental influence, while still explaining stable local properties.
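As an illustration of how a growth exponent $\alpha$ in $N(t) \propto t^{\alpha}$ can be estimated, the sketch below fits a straight line to $\log N$ against $\log t$. It is not the authors' fitting procedure, which is not described here. The function name `fit_growth_exponent`, the choice of ordinary least squares, the use of years elapsed since the start of the record as $t$, and the synthetic input series are all assumptions made for the example.

```python
import numpy as np

def fit_growth_exponent(t, n):
    """Estimate alpha and the prefactor in N(t) = c * t**alpha.

    Uses an ordinary least-squares line through (log t, log N).
    This is an illustrative choice, not the paper's stated method.
    """
    t = np.asarray(t, dtype=float)
    n = np.asarray(n, dtype=float)
    # Logarithms require strictly positive values.
    mask = (t > 0) & (n > 0)
    slope, intercept = np.polyfit(np.log(t[mask]), np.log(n[mask]), 1)
    return slope, np.exp(intercept)

# Synthetic, noise-free series (NOT data from the paper): N = 5 * t**2.3
years_elapsed = np.arange(1, 221)
synthetic_nodes = 5.0 * years_elapsed ** 2.3

alpha, prefactor = fit_growth_exponent(years_elapsed, synthetic_nodes)
print(f"alpha = {alpha:.3f}, prefactor = {prefactor:.3f}")
```

A change of exponent such as the one reported for MAG around 1950 could be probed by fitting the two periods separately, although the fitted values depend on the choice of time origin.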

Paper Structure

This paper contains 15 sections, 3 equations, 16 figures.

Figures (16)

  • Figure 1: Evolution of node counts in the MAG (top) and IMDb (bottom) networks on logarithmic scales. Left panels show absolute counts: cumulative total nodes (black line), nodes active with new edges that year (red line), and new nodes joining that year (green line). Right panels show the same data normalised by the contemporary world population. Grey bands indicate major historical events: La Belle Époque (1890-1914), World War I (1914-1918), and World War II (1939-1945). Population data after 1950 are taken from official UN records; earlier values are linearly interpolated between historical estimates. Note the distinct change in MAG growth rate around 1950 and the different sensitivities to historical events between networks.
  • Figure 2: Characteristic timescales of network processes, shown on logarithmic scales. Top panels: MAG network timescales; Bottom panels: IMDb network timescales. Left panels show node timescales: addition (black) and removal (red). Right panels show edge timescales: addition (black) and removal (red). Timescales were computed as the ratio of total quantity to its rate of change. Note the parallel evolution of timescales within each network despite their different absolute values, and the stability of their ratios over centuries of evolution.
  • Figure 3: The fraction of new participants per year in the MAG (left) and IMDb (right) networks, showing the balance between new entrants and established participants. The academic network shows a gradual decrease in the fraction of new authors over time, while the entertainment network maintains a more stable ratio.
  • Figure 4: Parameter evolution of the power law fits to the edge-addition probability distributions in the MAG and IMDb networks.
  • Figure 5: Evolution of collaboration event counts in the networks. Top row shows MAG network: absolute count of papers (left) and relative fractions by author count (right). Bottom row shows IMDb network: absolute count of movies (left) and relative fractions by lead actor count (right). Both networks show systematic changes in the distribution of collaboration size over time.
  • ...and 11 more figures
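The Figure 2 caption defines a characteristic timescale as the ratio of a total quantity to its rate of change, $\tau = X / (\mathrm{d}X/\mathrm{d}t)$. A minimal sketch of that calculation follows. The function name `characteristic_timescale`, the finite-difference estimate of the rate via `numpy.gradient`, and the synthetic series are assumptions for illustration; the caption does not say how the rate of change was actually estimated.

```python
import numpy as np

def characteristic_timescale(years, totals):
    """Return tau = X / (dX/dt) for a yearly series X.

    The derivative is approximated with finite differences
    (central in the interior, one-sided at the two ends).
    """
    years = np.asarray(years, dtype=float)
    totals = np.asarray(totals, dtype=float)
    rate = np.gradient(totals, years)
    # A zero rate means no change, so the timescale is reported as infinite.
    with np.errstate(divide="ignore", invalid="ignore"):
        tau = np.where(rate != 0, totals / rate, np.inf)
    return tau

# Synthetic series (NOT data from the paper): exponential growth with an
# e-folding time of 20 years, for which tau is 20 at every point.
years = np.arange(1800, 2021)
totals = np.exp((years - 1800) / 20.0)

tau = characteristic_timescale(years, totals)
print(tau[1:-1].min(), tau[1:-1].max())
```

Applying the same function to node and edge totals and dividing the results elementwise would give a ratio of the form $\tau_N/\tau_E$. The two end points are less accurate because they rely on one-sided differences.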