A tensor factorization model of multilayer network interdependence

Izabel Aguiar; Dane Taylor; Johan Ugander

TL;DR

This work develops the nonnegative Tucker decomposition (NNTuck) under a KL-divergence loss as a Poisson generative latent-factor model of multilayer networks, generalizing stochastic block models to multiple layers. Because minimizing the KL divergence is equivalent to maximizing the Poisson log-likelihood, the authors show that EM updates are step-by-step equivalent to tensorial multiplicative updates, extending a known matrix result to tensors. They propose likelihood-ratio tests (and split-LRTs) to define and detect layer independence, dependence, and redundance, and introduce two cross-validation tasks (independent and tubular link prediction) for model selection. Across synthetic and real networks (e.g., Krackhardt CSS, Malaria, and village networks), the approach identifies cases of layer independence, dependence, or redundance, with practical implications for survey design and data collection. The framework thus provides a principled, interpretable way to study interlayer structure and to smooth noisy multilayer data via latent layer representations.

Abstract

Multilayer networks describe the rich ways in which nodes are related by accounting for different relationships in separate layers. These multiple relationships are naturally represented by an adjacency tensor. In this work we study the use of the nonnegative Tucker decomposition (NNTuck) of such tensors under a KL loss as an expressive factor model that naturally generalizes existing stochastic block models of multilayer networks. Quantifying interdependencies between layers can identify redundancies in the structure of a network, indicate relationships between disparate layers, and potentially inform survey instruments for collecting social network data. We propose definitions of layer independence, dependence, and redundancy based on likelihood ratio tests between nested nonnegative Tucker decompositions. Using both synthetic and real-world data, we evaluate the use and interpretation of the NNTuck as a model of multilayer networks. Algorithmically, we show that using expectation maximization (EM) to maximize the log-likelihood under the NNTuck is step-by-step equivalent to tensorial multiplicative updates for the NNTuck under a KL loss, extending a previously known equivalence from nonnegative matrices to nonnegative tensors.
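
As a rough sketch of the model the abstract describes, the NNTuck of an adjacency tensor and its Poisson reading can be written as below. The dimension symbols $N$ (nodes), $L$ (layers), $K$ and $C$ (latent dimensions), and the shapes assigned to each factor, are assumptions made here for illustration; the factor names $\bm{U}$, $\bm{V}$, $\bm{Y}$ and $\boldsymbol{\mathcal{G}}$ follow Proposition 1.

$$
\hat{\boldsymbol{\mathcal{A}}} = \boldsymbol{\mathcal{G}} \times_1 \bm{U} \times_2 \bm{V} \times_3 \bm{Y},
\qquad
\hat{\mathcal{A}}_{ij\ell} = \sum_{k=1}^{K} \sum_{m=1}^{K} \sum_{c=1}^{C} \mathcal{G}_{kmc}\, U_{ik}\, V_{jm}\, Y_{\ell c},
$$

with $\bm{U}, \bm{V} \in \mathbb{R}_{\geq 0}^{N \times K}$, $\bm{Y} \in \mathbb{R}_{\geq 0}^{L \times C}$, and $\boldsymbol{\mathcal{G}} \in \mathbb{R}_{\geq 0}^{K \times K \times C}$. If each entry is modeled as $\mathcal{A}_{ij\ell} \sim \mathrm{Poisson}(\hat{\mathcal{A}}_{ij\ell})$, then the log-likelihood and the KL divergence are

$$
\log p(\boldsymbol{\mathcal{A}} \mid \hat{\boldsymbol{\mathcal{A}}}) = \sum_{i,j,\ell} \left( \mathcal{A}_{ij\ell} \log \hat{\mathcal{A}}_{ij\ell} - \hat{\mathcal{A}}_{ij\ell} - \log \mathcal{A}_{ij\ell}! \right),
\qquad
D_{\mathrm{KL}}(\boldsymbol{\mathcal{A}} \,\|\, \hat{\boldsymbol{\mathcal{A}}}) = \sum_{i,j,\ell} \left( \mathcal{A}_{ij\ell} \log \frac{\mathcal{A}_{ij\ell}}{\hat{\mathcal{A}}_{ij\ell}} - \mathcal{A}_{ij\ell} + \hat{\mathcal{A}}_{ij\ell} \right).
$$

The two sum to $\sum_{i,j,\ell} \left( \mathcal{A}_{ij\ell} \log \mathcal{A}_{ij\ell} - \mathcal{A}_{ij\ell} - \log \mathcal{A}_{ij\ell}! \right)$, which does not depend on the factors, so minimizing the KL divergence and maximizing the Poisson log-likelihood select the same factors.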

Paper Structure (44 sections, 1 theorem, 58 equations, 12 figures, 2 tables)

Introduction
Background
Stochastic Block Model (SBM)
Nonnegative Matrix Factorization (NMF)
Tensor Notation and Tucker Decomposition
Frontal slices
Tensor fibers
Unfoldings
The tensor $n$-mode product ($\times_n$)
Tucker decomposition
Multilayer Networks
Related Work
Tensor methods for multilayer networks
Stochastic block models for multilayer networks
Layer interdependence
...and 29 more sections

Key Result

Proposition 1

Determining the factor matrices $\bm{U}$, $\bm{V}$, and $\bm{Y}$ and the core tensor $\boldsymbol{\mathcal{G}}$ in the NNTuck by maximizing the log-likelihood using the EM updates (eq:EMupdates) is equivalent to using the multiplicative updates (eq:KimChoiUpdates) to minimize the KL-divergence.
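
The proposition rests on the KL objective and the Poisson log-likelihood differing only by a term that is constant in the factors. The following is a minimal numerical check of that fact, not the authors' implementation: the helper names, the tensor shapes (`U`, `V` of size N x K, `Y` of size L x C, `G` of size K x K x C), and the random test data are all assumptions made for illustration. It does not implement the EM or multiplicative updates themselves.

```python
import math
import numpy as np

def nntuck_reconstruct(G, U, V, Y):
    # A_hat[i, j, l] = sum_{k, m, c} G[k, m, c] * U[i, k] * V[j, m] * Y[l, c]
    return np.einsum("kmc,ik,jm,lc->ijl", G, U, V, Y)

def kl_divergence(A, A_hat):
    # Generalized KL divergence, using the convention 0 * log(0) = 0.
    mask = A > 0
    log_term = np.zeros_like(A_hat)
    log_term[mask] = A[mask] * np.log(A[mask] / A_hat[mask])
    return float(np.sum(log_term - A + A_hat))

def poisson_log_likelihood(A, A_hat):
    # log P(A | A_hat) for independent Poisson entries; lgamma(a + 1) = log(a!).
    log_fact = np.vectorize(math.lgamma)(A + 1.0)
    return float(np.sum(A * np.log(A_hat) - A_hat - log_fact))

def random_factors(rng, N, L, K, C):
    # Hypothetical strictly positive factors, so every A_hat entry is > 0.
    U = rng.uniform(0.1, 1.0, size=(N, K))
    V = rng.uniform(0.1, 1.0, size=(N, K))
    Y = rng.uniform(0.1, 1.0, size=(L, C))
    G = rng.uniform(0.1, 1.0, size=(K, K, C))
    return G, U, V, Y

rng = np.random.default_rng(0)
N, L, K, C = 6, 4, 2, 2

# Synthetic count-valued adjacency tensor drawn from one set of factors.
A = rng.poisson(nntuck_reconstruct(*random_factors(rng, N, L, K, C))).astype(float)

# Two unrelated candidate reconstructions of the same data.
A_hat_1 = nntuck_reconstruct(*random_factors(rng, N, L, K, C))
A_hat_2 = nntuck_reconstruct(*random_factors(rng, N, L, K, C))

# KL + log-likelihood depends only on A, so it is the same for both candidates.
const_1 = kl_divergence(A, A_hat_1) + poisson_log_likelihood(A, A_hat_1)
const_2 = kl_divergence(A, A_hat_2) + poisson_log_likelihood(A, A_hat_2)
print(abs(const_1 - const_2))
```

Because the sum of the two quantities is the same for any candidate reconstruction, ranking candidates by lower KL divergence and ranking them by higher Poisson log-likelihood give the same order.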

Figures (12)

Figure 1: Multilayer networks account for the reality and variety of ways in which nodes interact in a system. In this example, a social network is defined by three different types of social interaction and is represented by a tensor with three frontal slices. The process generating the "spend time with" layer is a linear combination of the processes generating the "family" and "friends" layers. On the right, we see a visual representation of the NNTuck of this network and how the third factor matrix accounts for these linearly dependent layers.
Figure 2: We reproduce the results of the methods for interpreting $\hat{\bm{Y}}$ in the NNTuck of the first and second synthetic networks described above as well as for the 48th village from [banerjee2019] (labelled "Gossip village 48" in Figure \ref{fig:allauc}). For the synthetic networks, $\bm{Y}$ is the true factor matrix from which the network was generated. For all three, $\hat{\bm{Y}}$ has been estimated from the NNTuck with the highest log-likelihood over 20 runs with different random initializations, $\hat{\bm{Y}}^{(1)}$ has been normalized so that the entries of each row sum to one, and $\hat{\bm{Y}}^{(2)}$ has been normalized so that each row has unit 2-norm. For the synthetic networks, $\hat{\bm{Y}}^*$ is the resulting factor matrix after rewriting $\boldsymbol{\mathcal{G}}$ in the basis of layers 1 and 3 (a process which is described in detail in \ref{SM:masking}). Note that in the synthetic examples, all methods for interpreting $\hat{\bm{Y}}$, including simply inspecting $\hat{\bm{Y}}$, accurately represent how the layers of the network are related to one another. Specifically, note how $\hat{\bm{Y}}^*$ almost exactly recovers the ground truth of how the layers are interdependent. Focusing on the $\hat{\bm{Y}}^*$ matrix for the gossip village, the two reference layers chosen are "Who asks you for advice?" and "Who are your relatives?", and the remaining layers can be understood as linear combinations of these.
Figure 3: NNTuck performance on independent (left) and tubular (right) link prediction tasks with varying latent dimensions $K$ and $C$ for Krackhardt's CSS multilayer network. Whereas the layer dependent NNTuck with $C < L$ has a higher test-AUC in the independent task, the layer independent NNTuck generally performs as well as the other models in the tubular task. Recall that in this figure, as well as in \ref{fig:mal_pred} and \ref{fig:bvil_link}, the layer independent NNTuck is equivalent to the MultiTensor model of De Bacco et al.
Figure 4: The test-AUC from the independent and tubular link prediction tasks in the Malaria multilayer network. The layer independent NNTuck always results in a higher test-AUC than when allowing for layer dependence or layer redundance.
Figure 5: The test-AUC from the independent and tubular link prediction tasks for the Village 0 multilayer network. In both tasks, both the layer dependent and layer redundant NNTucks perform just as well as the layer independent NNTuck in terms of test-AUC.
...and 7 more figures

Theorems & Definitions (7)

Definition 1: Layer independence
Definition 2: Layer dependence
Definition 3: Layer redundance
Proposition 1
Proof 1
Proof 2
Proof 3
