Proper network randomization is key to assessing social balance

Bingjie Hao, István A. Kovács

TL;DR

Even when a network is strongly balanced by construction, current null models can fail to detect it. Matching the signed degree preferences of the nodes and preserving the network topology in the null model are both critical steps.

Abstract

Studying significant network patterns, known as graphlets (or motifs), has been a popular approach to understand the underlying organizing principles of complex networks. Statistical significance is routinely assessed by comparing to null models that randomize the connections while preserving some key aspects of the data. However, in signed networks, capturing both positive (friendly) and negative (hostile) relations, the results have been controversial and also at odds with the classical theory of structural balance. We show that this is largely due to the fact that large-scale signed networks exhibit a poor correlation between the number of positive and negative ties of each node. As a solution, here we propose a null model based on the maximum entropy framework that preserves both the signed degrees and the network topology (STP randomization). With STP randomization the results change qualitatively and most social networks consistently satisfy strong structural balance, both at the level of triangles and larger graphlets. We propose a potential underlying mechanism of the observed patterns in signed social networks and outline further applications of STP randomization.
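
To make the significance test described in the abstract concrete, the following minimal Python sketch counts signed triangles and computes $z$-scores against a randomized ensemble. It is not STP randomization: as a simpler stand-in it shuffles the signs over the fixed edges, which preserves the topology and the total number of positive and negative edges but not the signed degrees of individual nodes. The function names, the toy two-group network, and the sample size are assumptions made for illustration only.

```python
import itertools
import random
import statistics


def triangle_counts(edges):
    """Count triangles by their number of negative edges (index 0..3).

    edges maps a node pair (u, v) to a sign, +1 or -1.
    """
    sign = {frozenset(e): s for e, s in edges.items()}
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    counts = [0, 0, 0, 0]
    for u in adj:
        for v, w in itertools.combinations(sorted(adj[u]), 2):
            # Require u < v < w so that each triangle is counted once.
            if u < v and w in adj[v]:
                signs = (
                    sign[frozenset((u, v))],
                    sign[frozenset((u, w))],
                    sign[frozenset((v, w))],
                )
                counts[sum(s < 0 for s in signs)] += 1
    return counts


def sign_shuffle(edges, rng):
    """Simplified null model (NOT STP): permute signs over the fixed edges."""
    signs = list(edges.values())
    rng.shuffle(signs)
    return dict(zip(edges, signs))


def z_scores(edges, n_samples=200, seed=0):
    """z = (n_obs - mean_rand) / sigma_rand for each triangle type."""
    rng = random.Random(seed)
    observed = triangle_counts(edges)
    samples = [triangle_counts(sign_shuffle(edges, rng)) for _ in range(n_samples)]
    scores = []
    for k in range(4):
        values = [s[k] for s in samples]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        # The z-score is undetermined when the ensemble has zero spread.
        scores.append(None if std == 0 else (observed[k] - mean) / std)
    return observed, scores


# Toy network, balanced by construction: two groups of four nodes,
# positive edges within a group and negative edges between groups.
group = {n: (0 if n < 4 else 1) for n in range(8)}
edges = {
    (u, v): (1 if group[u] == group[v] else -1)
    for u, v in itertools.combinations(range(8), 2)
}
observed, z = z_scores(edges)
print("observed counts by number of negative edges:", observed)
print("z-scores:", z)
```

In this toy network only balanced triangles occur (zero or two negative edges), so the triangle type with one negative edge comes out underrepresented relative to the shuffled ensemble. Replacing `sign_shuffle` with a sampler that also preserves each node's signed degrees would be the step toward the null model proposed in the paper.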

Paper Structure (14 sections, 3 equations, 5 figures, 1 table)


Figures (5)

  • Figure 1: Signed degree inconsistency. Positive ($k_{+}$) and negative degrees ($k_{-}$) in social networks are poorly correlated. The dashed black line indicates a perfect correlation between $k_{+}$ and $k_{-}$. The $r$ values denote the Pearson correlation coefficient between $k_{+}$ and $k_{-}$ of each dataset.
  • Figure 2: Overview of signed null models. The original network that contains two groups of nodes (yellow and grey) is shown in the middle. Positive edges are shown in blue, negative edges in red. Thicker lines indicate edges that are different from the original network.
  • Figure 3: Signed graphlets in the Slashdot network compared to different null models. (A) Triangles. (B) Squares. The $\log_2(\mathrm{fold~change})$ is shown on the top accompanied by the grey dashed line indicating a 2-fold increase or decrease. $z$-scores are shown at the bottom, in white if matching SB expectations, and black otherwise. The background of the $z$-scores is blue for positive values and red for negative values.
  • Figure 4: Overview of graphlet significance in the studied networks. The $z$-scores are indicated by blue (overrepresented) and red (underrepresented) blocks. We list the balanced graphlets first, separated from the unbalanced graphlets by a yellow line. We leave the block white if $n_{obs} = \sigma_{rand} =0$ as it leads to an undetermined $z$-score.
  • Figure 5: Illustration of signed copying mechanisms. (A) Signed node copying. When node 4 is added to the existing network, it copies the edges and signs of node 1, forming squares. Note that graphlet 7 ($+-+-{}$) cannot be generated this way. (B) Signed edge copying. When node 4 is connected to node 1 by a positive edge, it may copy node 1's edges and signs, forming triangles, squareZs and squareXs. When node 4 is connected to node 1 by a negative edge, it may copy node 1's edges but will reverse the signs. The edge copying mechanism forms balanced triangles, eventually leading to larger balanced graphlets. The initial edges are indicated by solid lines and the copied edges are indicated by dotted lines. The resulting graphlets are indicated by the indices within the green boxes, following the notation of Fig. \ref{fig:result_summary} and Fig. S1. The purple arrows point from the node being copied to the node that is copying.
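
The signed edge copying rule in the Figure 5 caption can be illustrated with a small simulation. The sketch below is one reading of that caption, not the authors' implementation: a new node attaches to a random existing node with a random sign, then copies each of that node's edges with some probability, keeping the sign if the attaching edge is positive and reversing it if negative. The function names and the parameters `p_copy` and `p_negative` are hypothetical; the source does not specify any probabilities.

```python
import itertools
import random


def grow_by_edge_copying(n_nodes, p_copy=0.5, p_negative=0.3, seed=0):
    """Grow a signed network by signed edge copying (illustrative sketch).

    Returns a dict mapping (smaller_node, larger_node) to a sign, +1 or -1.
    p_copy and p_negative are assumed parameters, not taken from the paper.
    """
    rng = random.Random(seed)
    sign = {(0, 1): 1}  # seed network: a single positive edge
    adj = {0: {1}, 1: {0}}
    for new in range(2, n_nodes):
        target = rng.randrange(new)  # existing node to attach to
        s = -1 if rng.random() < p_negative else 1
        neighbours = sorted(adj[target])  # the target's edges before attaching
        adj[new] = {target}
        adj[target].add(new)
        sign[(target, new)] = s
        for w in neighbours:
            if rng.random() < p_copy:
                # Copy the target's edge to w; a negative attaching edge
                # reverses the copied sign.
                sign[(w, new)] = s * sign[(min(target, w), max(target, w))]
                adj[new].add(w)
                adj[w].add(new)
    return sign


def triangle_balance(sign):
    """Return (number of triangles, number of unbalanced triangles)."""
    nodes = sorted({u for edge in sign for u in edge})
    total = unbalanced = 0
    for a, b, c in itertools.combinations(nodes, 3):
        if (a, b) in sign and (a, c) in sign and (b, c) in sign:
            total += 1
            # A triangle is unbalanced when the product of its signs is negative.
            if sign[(a, b)] * sign[(a, c)] * sign[(b, c)] < 0:
                unbalanced += 1
    return total, unbalanced


network = grow_by_edge_copying(60)
total, unbalanced = triangle_balance(network)
print("triangles:", total, "unbalanced:", unbalanced)
```

Under this rule every copied edge closes a triangle whose sign product is positive, which is consistent with the caption's statement that edge copying forms balanced triangles.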