Unveiling the Social Fabric: A Temporal, Nation-Scale Social Network and its Characteristics

Jolien Cremers, Benjamin Kohler, Benjamin Frank Maier, Stine Nymann Eriksen, Johanna Einsiedler, Frederik Kølby Christensen, Sune Lehmann, David Dreyer Lassen, Laust Hvas Mortensen, Andreas Bjerre-Nielsen

Abstract

Social networks shape individuals' lives, influencing everything from career paths to health. This paper presents a registry-based, multi-layer and temporal network of the entire Danish population in the years 2008-2021 (roughly 7.2 million individuals). Our network maps the relationships formed through family, households, neighborhoods, colleagues and classmates. We outline key properties of this multiplex network, introducing both an individual-focused perspective and a bipartite representation. We show how to aggregate and combine the layers, and how to efficiently compute network measures such as shortest paths in large administrative networks. Our analysis reveals that past connections reappear later in other layers, that the number of relationships aggregated over time reflects the position in the income distribution, and that we can recover canonical shortest path length distributions when appropriately weighting connections. Along with the network data, we release a Python package that uses the bipartite network representation for efficient analysis.
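To make the two representations mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the released package: the toy `memberships` data and the helper names `bipartite_adjacency`, `projected_edges` and `person_distance` are hypothetical and chosen only for illustration. It assumes each layer can be described as people belonging to shared containers (a workplace, a household, a class), and it relies on the fact that a person-to-person distance in the individual-centered network equals half the hop count in the bipartite network.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy data: container -> people who belong to it.
memberships = {
    "workplace_A": ["ann", "bob", "cat"],
    "household_1": ["cat", "dan"],
    "class_X": ["dan", "eve"],
}

def bipartite_adjacency(memberships):
    """Build a person <-> container adjacency map (one edge per membership)."""
    adj = {}
    for container, people in memberships.items():
        for person in people:
            adj.setdefault(("c", container), set()).add(("p", person))
            adj.setdefault(("p", person), set()).add(("c", container))
    return adj

def projected_edges(memberships):
    """Individual-centered view: one edge per pair of people sharing a container."""
    return {
        frozenset(pair)
        for people in memberships.values()
        for pair in combinations(people, 2)
    }

def person_distance(adj, source, target):
    """Breadth-first search on the bipartite graph.

    Every person-to-person step passes through exactly one container,
    so the person-level distance is the bipartite hop count divided by two.
    Returns None if the two people are not connected.
    """
    start, goal = ("p", source), ("p", target)
    if start not in adj or goal not in adj:
        return None
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node] // 2
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return None

adj = bipartite_adjacency(memberships)
print(person_distance(adj, "ann", "eve"))
print(len(projected_edges(memberships)))
```

The design point is storage: a container with n members needs n membership edges in the bipartite form but n(n-1)/2 edges in the individual-centered form, so the gap widens quickly for large workplaces or neighborhoods.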

Paper Structure (32 sections, 4 equations, 12 figures)

Figures (12)

  • Figure 1: Size and Temporal Stability Of The Network. Panel A shows the total number of edges per layer for a single year (i.e., 2021) and a ten-year time-span for the bipartite and the individual-centered representation of the network (log scale). Panel B depicts the year-over-year share of changing edges, i.e., the number of edges in one year that also exist in the next year relative to the total number of edges. Panel C shows the total number of nodes per layer for a single year and a ten-year time-span for the individual-centered and the bipartite representation of the network. Note that the family layer does not include bipartite nodes and that there are too few classmate container layers to be visible.
  • Figure 2: Edge Overlap With Varying Time-Span. Share of overlapping edges between the reference layer (y-axis) and the comparison layer (x-axis). The reference layer is measured for the year 2021 in all plots. The comparison layer includes a time-span of 0, 5, or 10 years before 2021. The overlapping share is computed as the number of overlapping edges divided by the total number of edges of the reference layer. In Panel A, the comparison layer includes no time-span, in Panel B a five-year time-span, and in Panel C a ten-year time-span.
  • Figure 3: Degree Distribution. Panel A shows the kernel density estimate of the degree distribution for a single year (i.e., 2021) and for five- and ten-year time-spans (log scale). Panel B depicts the Lorenz curve (cumulative share of degrees for the lower x% of nodes) by layer. Panel C shows the degree distribution (log scale) for stacked layers.
  • Figure 4: Inter-layer Degree Correlation. Pearson correlation between nodes' degrees for varying layers and time-spans. The time-span applies to the layers on both the x- and y-axis. Panel A shows the correlation with no time-span, Panel B with a time-span of five years, and Panel C with a time-span of ten years.
  • Figure 5: Relation Of Individual Degree And Income Or Sex Over The Working Life Cycle. Average degree distribution conditional on age and income (Panels A and B) or age and sex (Panels C and D). The plot is based on 2021 data. Panels A and C use a single-year time-span and Panels B and D a ten-year time-span.
  • ...and 7 more figures
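
The overlap share defined in the Figure 2 caption (overlapping edges divided by the number of edges in the reference layer) can be sketched as follows. This is an illustration on made-up data, not the authors' code: the helper names `undirected`, `aggregate` and `overlap_share` are hypothetical, edges are treated as undirected, and the time-span is assumed to cover the reference year plus the given number of preceding years.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[int]

def undirected(pairs: Iterable[Tuple[int, int]]) -> Set[Edge]:
    """Store each edge as a frozenset so that (a, b) and (b, a) coincide."""
    return {frozenset(pair) for pair in pairs}

def aggregate(layer_by_year: Dict[int, Set[Edge]], end_year: int, span: int) -> Set[Edge]:
    """Union of a layer's edges over end_year - span .. end_year (inclusive).

    Including end_year itself in the window is an assumption of this sketch.
    """
    edges: Set[Edge] = set()
    for year in range(end_year - span, end_year + 1):
        edges |= layer_by_year.get(year, set())
    return edges

def overlap_share(reference: Set[Edge], comparison: Set[Edge]) -> float:
    """Share of reference-layer edges that also appear in the comparison layer."""
    if not reference:
        return 0.0
    return len(reference & comparison) / len(reference)

# Made-up toy layers: colleagues in 2021 and classmates in three different years.
colleagues_2021 = undirected([(1, 2), (2, 3), (3, 4), (4, 5)])
classmates = {
    2021: undirected([(1, 2)]),
    2016: undirected([(2, 3)]),
    2011: undirected([(3, 4)]),
}

for span in (0, 5, 10):
    comparison = aggregate(classmates, end_year=2021, span=span)
    print(span, overlap_share(colleagues_2021, comparison))
# prints:
# 0 0.25
# 5 0.5
# 10 0.75
```

Note that the measure is asymmetric: swapping the reference and comparison layers changes the denominator, which is why the caption distinguishes the two axes.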