Embeddings of Nation-Level Social Networks

Tanzir Pial, Flavio Hafner, Dakota Handzlik, Enamul Hassan, Lucas Sage, Ana Macanovic, Tom Emery, Arnout van de Rijt, Steven Skiena

Abstract

Full nation-scale social networks are now emerging from countries such as the Netherlands and Denmark, but these networks present challenging technical issues in working with large, multiplex, time-dependent networks. We report on our experiences in producing dynamic node embeddings of the population network of the Netherlands. We present (a) a layer-sensitive random walk strategy which improves on traditional flattening methods for multiplex networks, (b) a temporal alignment strategy that brings annual networks into the same embedding space, without leaking information to future years, and (c) the use of Fibonacci spirals and embedding whitening techniques for more balanced and effective partitioning. We demonstrate the effectiveness of these techniques in building embedding-based models for 13 downstream tasks.
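The layer-sensitive random walk in (a) can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the layer names, the `stay_prob` parameter, and the adjacency layout are all assumptions made for the example.

```python
import random

def layer_sensitive_walk(adj, start, length, stay_prob=0.8, seed=0):
    """Random walk on a multiplex network that tracks edge layers.

    `adj` maps node -> layer -> list of neighbour nodes. Instead of
    flattening all layers together, the walker keeps its current layer
    with probability `stay_prob` and only then picks a fresh layer,
    so within-layer structure is preserved in the sampled walks.
    """
    rng = random.Random(seed)
    node = start
    layer = rng.choice(list(adj[node]))        # start in a random layer
    walk = [node]
    for _ in range(length - 1):
        layers = list(adj[node])
        if layer not in layers or rng.random() > stay_prob:
            layer = rng.choice(layers)         # switch layers
        node = rng.choice(adj[node][layer])    # step within the chosen layer
        walk.append(node)
    return walk

# Toy multiplex network with two layers (hypothetical layer names).
adj = {
    "a": {"family": ["b"], "work": ["c"]},
    "b": {"family": ["a"]},
    "c": {"work": ["a"]},
}
walk = layer_sensitive_walk(adj, "a", 5)
```

Walks sampled this way can then be fed to a standard skip-gram embedding model, exactly as in layer-blind pipelines; only the walk generation changes.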

Paper Structure

This paper contains 15 sections, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Layer-aware embeddings outperform layer-blind embeddings on all 5 income variables. Performance is averaged over two years for each variable.
  • Figure 2: AUC of embeddings on different binary classification tasks. The grey bars show the AUC of the baseline layer-blind embeddings; the green portions show the improvement achieved by layer-aware embeddings.
  • Figure 3: Alignment quality over time when mapping embeddings from each source year (2010--2020) to the 2009 target space. We compare Per-dimension Linear Regression (OLS) and Orthogonal Procrustes. Panels show Pearson (left) and Spearman (right) correlations between aligned and target embeddings (higher is better). Performance declines with temporal distance; OLS consistently outperforms Orthogonal Procrustes.
  • Figure 4: Example Fibonacci grids with Voronoi cells in 2D and 3D. On the left, the 2D cells are generated from 10 points on the Fibonacci lattice, dividing the circumference into 10 almost equal parts. On the right, 100 points generate 100 Voronoi clusters on the 3D sphere's surface. Fibonacci grids in higher dimensions generate similar Voronoi cells.
  • Figure 5: (a) CDF of cluster sizes (every fifth cluster marked). Whitening yields a near-uniform distribution (near the $y=x$ line), whereas for the regular embeddings fewer than 5% of the clusters contain more than 50% of the points. (b) Fraction of individuals remaining in their 2009 cluster.
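As a rough illustration of the Fibonacci grids in Figure 4, the sketch below places $n$ points nearly uniformly on the 3D unit sphere using the standard golden-angle construction; the paper's exact lattice, particularly its extension to higher-dimensional embedding spaces, may differ.

```python
import math

def fibonacci_sphere(n):
    """Place n near-uniform points on the unit sphere in R^3.

    Heights z are evenly spaced in (-1, 1); each successive point is
    rotated around the z-axis by the golden angle, which spreads the
    points so their Voronoi cells have roughly equal area.
    """
    golden_angle = math.pi * (3 - math.sqrt(5))  # ~2.39996 rad
    points = []
    for i in range(n):
        z = 1 - 2 * (i + 0.5) / n       # evenly spaced heights
        r = math.sqrt(1 - z * z)        # circle radius at height z
        theta = golden_angle * i        # golden-angle rotation
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

pts = fibonacci_sphere(100)
```

Assigning each (whitened, normalized) embedding vector to its nearest lattice point then partitions the population into the roughly equal-sized Voronoi clusters that Figure 5 compares against regular embeddings.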