NetEffect: Discovery and Exploitation of Generalized Network Effects

Meng-Chieh Lee, Shubhranshu Shekhar, Jaemin Yoo, Christos Faloutsos

TL;DR

This work addresses two questions: whether generalized network effects (GNE) exist in a large graph with only a few labeled nodes, and how to exploit them for downstream node classification. It introduces a three-part framework: NetEffect_Test for principled statistical detection of GNE, NetEffect_Est for explainable, closed-form estimation of the class-compatibility matrix, and NetEffect_Exp for accurate, scalable exploitation using emphasis-aware propagation. The authors show that many real-world graphs regarded as heterophilous lack meaningful GNE, while others exhibit heterophily or x-ophily with weak or strong GNE, and that leveraging GNE improves classification accuracy while achieving substantial speedups on million-scale graphs. The estimated compatibility matrices give interpretable insights, and the approach scales linearly and is reproducible on large datasets.

Abstract

Given a large graph with few node labels, how can we (a) identify whether there are generalized network effects (GNE) or not, (b) estimate GNE to explain the interrelations among node classes, and (c) exploit GNE efficiently to improve the performance on downstream tasks? The knowledge of GNE is valuable for various tasks like node classification and targeted advertising. However, identifying GNE such as homophily, heterophily, or their combination is challenging in real-world graphs due to the limited availability of node labels and noisy edges. We propose NetEffect, a graph mining approach to address the above issues, enjoying the following properties: (i) Principled: a statistical test to determine the presence of GNE in a graph with few node labels; (ii) General and Explainable: a closed-form solution to estimate the specific type of GNE observed; and (iii) Accurate and Scalable: the integration of GNE for accurate and fast node classification. Applied to real-world graphs, NetEffect discovers the unexpected absence of GNE in numerous graphs that were previously recognized to exhibit heterophily. Further, we show that incorporating GNE is effective for node classification. On a million-scale real-world graph, NetEffect achieves over 7 times speedup (14 minutes vs. 2 hours) compared to most competitors.

Paper Structure (31 sections, 5 theorems, 10 equations, 7 figures, 6 tables, 4 algorithms)

Key Result

lemma 1

Given adjacency matrix ${\boldsymbol A}$ and initial beliefs $\hat{{\boldsymbol E}}$, the closed-form solution of the vectorized compatibility matrix $\text{vec}{(\hat{{\boldsymbol H}})}$ is the least-squares solution $\text{vec}{(\hat{{\boldsymbol H}})} = ({\boldsymbol X}^{\top}{\boldsymbol X})^{-1}{\boldsymbol X}^{\top}{\boldsymbol y}$, where ${\boldsymbol X} = {\boldsymbol I}_{c \times c} \otimes ({\boldsymbol A}\hat{{\boldsymbol E}})$ and ${\boldsymbol y} = \text{vec}{(\hat{{\boldsymbol E}})}$.
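The lemma can be illustrated with a minimal NumPy sketch. It assumes the closed form is the ordinary least-squares fit of $\hat{{\boldsymbol E}} \approx {\boldsymbol A}\hat{{\boldsymbol E}}{\boldsymbol H}$, with $\text{vec}(\cdot)$ stacking columns. The function name `estimate_compatibility`, the 4-node path graph, and the uncentered one-hot beliefs are illustrative assumptions, not taken from the paper. The full NetEffect_Est procedure may involve further steps that are not shown here.

```python
import numpy as np

def estimate_compatibility(A, E_hat):
    """Least-squares estimate of H in E_hat ~= A @ E_hat @ H (illustrative sketch)."""
    c = E_hat.shape[1]
    B = A @ E_hat                      # n x c: beliefs aggregated over neighbors
    X = np.kron(np.eye(c), B)          # (n*c) x (c*c), i.e. I_c (Kronecker) (A E_hat)
    y = E_hat.reshape(-1, order="F")   # vec(E_hat): stack columns
    # lstsq returns the minimum-norm least-squares solution; it equals
    # (X^T X)^{-1} X^T y whenever X has full column rank.
    vec_H, *_ = np.linalg.lstsq(X, y, rcond=None)
    return vec_H.reshape(c, c, order="F")

# Toy example (assumed): path graph 0-1-2-3 with alternating classes 0,1,0,1.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
E_hat = np.array([[1, 0],
                  [0, 1],
                  [1, 0],
                  [0, 1]], dtype=float)

H_hat = estimate_compatibility(A, E_hat)
print(H_hat)  # off-diagonal entries dominate, i.e. a heterophily pattern
```

Because ${\boldsymbol X}$ is block-diagonal, the same estimate can be obtained without forming the Kronecker product by solving `np.linalg.lstsq(A @ E_hat, E_hat, rcond=None)` directly, which is the cheaper choice for large graphs.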

Figures (7)

  • Figure 1: NetEffect works well, thanks to its three novel contributions: (a) NetEffect_Test statistically tests for the existence of GNE. (b) NetEffect_Est explains the graph with the x-ophily compatibility matrix. (c) NetEffect_Exp wins and is fast.
  • Figure 1: NetEffect matches all specs, while baselines miss one or more. '?' and 'N/A' denote unclear and not applicable.
  • Figure 2: NetEffect_Test works: It discovers that real-world heterophily graphs do not necessarily have GNE. For each graph, we report the edge counting on the left (not available in practice), and the $p$-value table output from NetEffect_Test on the right, where "P" denotes the presence of GNE, and "F" denotes the absence of GNE.
  • Figure 3: NetEffect_Est handles the imbalanced case well. Labels of class $1$ are upsampled.
  • Figure 4: Emphasis matrix at work: it prefers well-connected neighbors.
  • ...and 2 more figures

Theorems & Definitions (16)

  • definition 1
  • definition 2
  • definition 3: Mutually indistinguishable
  • lemma 1: Network Effect Formula (NEF)
  • proof
  • lemma 2: Convergence of Random Walks
  • lemma 3: Convergence of Non-Backtracking Random Walks
  • proof
  • lemma 4: Exact Convergence
  • proof
  • ...and 6 more