A comprehensive generalization of the Friendship Paradox to weights and attributes

Anna Evtushenko, Jon Kleinberg

TL;DR

This work unifies the Friendship Paradox (FP) and its extensions for undirected graphs with weights and arbitrary node attributes, introducing two main extensions: LEFP (list-based) and SEFP (singular-based). It proves that the original FP gaps are nonnegative, with equality only in regular graphs, and shows analogous results for weighted variants (LWFP, SWFP) and for attribute-based extensions via correlation rules involving $r_{w,a}$ and $r_{\gamma,a}$. The paper provides exact gap formulations, demonstrates reductions to simpler cases, and validates the theory with both simulations (on random graphs with weights and attributes) and real data (Facebook100), highlighting that with randomly assigned attributes the attribute-based paradox fails in about half of the cases. The results yield practical, correlation-driven criteria to assess when attribute-based FP extensions hold, applicable to both synthetic and real-world weighted networks. Overall, the framework offers a comprehensive, accessible account of the math behind the FP and its basic generalizations, linking theory to data and prior work.

Abstract

The Friendship Paradox is a simple and powerful statement about node degrees in a graph (Feld 1991). However, it only applies to undirected graphs with no edge weights, and the only node characteristic it concerns is degree. Since many social networks are more complex than that, it is useful to generalize this phenomenon, if possible, and a number of papers have proposed different generalizations. Here, we unify these generalizations in a common framework, retaining the focus on undirected graphs and allowing for weighted edges and for numeric node attributes other than degree to be considered, since this extension allows for a clean characterization and links to the original concepts most naturally. While the original Friendship Paradox and the Weighted Friendship Paradox hold for all graphs, considering non-degree attributes actually makes the extensions fail around 50% of the time, given random attribute assignment. We provide simple correlation-based rules to see whether an attribute-based version of the paradox holds. In addition to theory, our simulation and data results show how all the concepts can be applied to synthetic and real networks. Where applicable, we draw connections to prior work to make this an accessible and comprehensive paper that lets one understand the math behind the Friendship Paradox and its basic extensions.
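
To make the correlation-based rule concrete, the following is a short worked derivation for the simplest case: the unweighted, list-style attribute gap. The notation here is an assumption for illustration and may differ from the paper's: $n$ nodes, degrees $d_i$, attributes $a_i$, population means $\mu_d, \mu_a$ and standard deviations $\sigma_d, \sigma_a$, with at least one edge so that $\mu_d > 0$. The gap is taken to be the mean attribute over all friend-list entries minus the mean attribute over nodes.

```latex
\begin{align*}
g_{\text{list}}
  &= \frac{\sum_i d_i a_i}{\sum_i d_i} - \frac{1}{n}\sum_i a_i \\
  &= \frac{\frac{1}{n}\sum_i d_i a_i - \mu_d \mu_a}{\mu_d} \\
  &= \frac{\operatorname{Cov}(d,a)}{\mu_d}
   = \frac{r_{d,a}\,\sigma_d\,\sigma_a}{\mu_d}.
\end{align*}
% Special case a_i = d_i (the original paradox):
%   g = Var(d) / mu_d >= 0, with equality iff all degrees are equal.
```

Under these assumptions the sign of the gap is the sign of $r_{d,a}$. Setting $a_i = d_i$ gives $\operatorname{Var}(d)/\mu_d \geq 0$, which vanishes only when every node has the same degree. When attributes are assigned independently of the graph, the covariance is roughly as likely to be negative as positive, which is consistent with the roughly 50% failure rate stated above.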

Paper Structure (32 sections, 34 equations, 4 figures)

Figures (4)

  • Figure 1: This is a weighted undirected graph $G$ on 3 nodes. The weight of each edge, when not 1, is listed on the edge. The nodes are labeled $A$ through $C$. Each node has a numeric attribute $a_i$ associated with it, and it's listed in blue next to each node.
  • Figure 2: The proportion of cases in which each paradox fails resembles a step function with $j$ as the argument---that is why in the top panel we restrict $j$ to $[-3,3]$ so the shift is more apparent. Since positive $j$ implies a positive degree-attribute correlation which, in our construction, is tightly linked to other correlations being positive, and correlation needs to be just slightly greater than or equal to $0$ for an attribute-based version of the paradox to hold, it makes sense that the gaps become positive and the proportion of failure drops to 0 in a step-like fashion. A contributing factor is that given our construction, the standard deviation of the $1000$ correlations for each condition is low ($<0.035$ and highest at $j=0$). See the Supplementary Information for a plot of the standard deviation. (Note also that in the top panel, the minute details of the lines' behavior around 0 are due to randomness which is highest at $j=0$. Here, the lines seemingly all cross around $-0.5$, but that isn't the case in each run of the simulation. We expect the proportion of failure to be $0.5$ at $j=0$ for each line, but it's not guaranteed to be precisely that in a simulation.)
  • Figure 3: We created a standard normal attribute sequence for each of the $1000$ $G_{1000,\frac{1}{50}}$ networks and found $r_{d,a}$, $r_{\delta,a}$, $r_{w,a}$ and $r_{\gamma,a}$, and the LAFP, SAFP, LWAFP and SWAFP gap sizes for each. The LAFP gap signs follow the $r_{d,a}$ signs, the SAFP gap signs follow the $r_{\delta,a}$ signs, the LWAFP gap signs follow the $r_{w,a}$ signs, and SWAFP gap signs follow the $r_{\gamma,a}$ signs. Furthermore, the correlation between the $x$-axis quantity and the $y$-axis quantity is $0.9994$ for all four pairs (the exact values are slightly different). But the correlation may not be as high for less symmetric cases such as real-world data. While we aren't interested in gap sizes on their own, it is interesting to see a strong linear relationship between them and their associated correlations in this specific case. Note: for illustration purposes we only plot the results for $100$ networks out of $1000$.
  • Figure 4: Gap sizes and their associated correlations for the Facebook100 data. As in Figure 3, which looked at random graphs, the LAFP gap signs follow the $r_{d,a}$ signs, the SAFP gap signs follow the $r_{\delta,a}$ signs, the LWAFP gap signs follow the $r_{w,a}$ signs, and SWAFP gap signs follow the $r_{\gamma,a}$ signs. But here the correlation between the $x$-axis quantity and the $y$-axis quantity is lower than $0.9994$ (but still very high): $0.986$ for $r_{d,a}$ and $g_{LAFP}$ (red circles), $0.983$ for $r_{\delta,a}$ and $g_{SAFP}$ (blue triangles), $0.990$ for $r_{w,a}$ and $g_{LWAFP}$ (green circles), $0.986$ for $r_{\gamma,a}$ and $g_{SWAFP}$ (black triangles).
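
The sign-following behavior described in the captions for Figures 3 and 4 can be checked numerically for the unweighted list-style case. The sketch below is not the authors' simulation code. It assumes the list-style attribute gap is the degree-weighted mean attribute minus the plain mean attribute, uses smaller graphs than the paper ($n = 200$ rather than $1000$) to keep the run short, and all function names are hypothetical.

```python
import random


def gnp_degrees(n, p, rng):
    """Degree sequence of an Erdos-Renyi G(n, p) graph (each pair linked with prob. p)."""
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                deg[i] += 1
                deg[j] += 1
    return deg


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def list_attribute_gap(deg, attr):
    """Assumed list-style gap: mean attribute over friend-list entries minus mean over nodes."""
    weighted = sum(d * a for d, a in zip(deg, attr)) / sum(deg)
    return weighted - mean(attr)


def run_trials(trials=200, n=200, p=1 / 50, seed=0):
    """Return (r_{d,a}, gap) pairs for graphs with standard normal attributes."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        deg = gnp_degrees(n, p, rng)
        attr = [rng.gauss(0.0, 1.0) for _ in range(n)]
        out.append((pearson(deg, attr), list_attribute_gap(deg, attr)))
    return out


results = run_trials()
# Fraction of trials where the gap sign matches the correlation sign.
sign_agreement = mean([float((r > 0) == (g > 0)) for r, g in results])
# Fraction of trials where the attribute-based paradox fails (negative gap).
failure_rate = mean([float(g < 0) for _, g in results])
print(f"sign agreement: {sign_agreement:.3f}, failure rate: {failure_rate:.3f}")
```

Because this gap equals the degree-attribute covariance divided by the mean degree, the sign agreement is exact up to floating-point error, while the failure rate only hovers near one half and varies with the seed. The singular and weighted variants in the captions depend on the quantities $\delta$, $w$ and $\gamma$, which are not defined in this section, so they are not sketched here.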