Understanding Heterophily for Graph Neural Networks
Junfu Wang, Yuanfang Guo, Liang Yang, Yunhong Wang
TL;DR
This work introduces the Heterophilous Stochastic Block Model (HSBM) to capture diverse heterophily patterns in graphs and analyzes how Graph Convolution (GC) affects multi-class node classification. It shows that GC gains are governed by two factors: the Euclidean distance between class-wise neighborhood distributions, and the average node degree. Neighborhood inconsistency and deeper stacking introduce additional effects. The authors derive explicit forms for separability under GC and validate the theory through synthetic experiments and eight real-world datasets, highlighting when GC improves or harms performance under different heterophily regimes. Overall, the framework provides principled guidance for designing GNNs on heterophilous graphs and for interpreting GC behavior beyond homophily assumptions.
Abstract
Graphs with heterophily, where nodes are connected to dissimilar neighbors through various patterns, have been regarded as challenging scenarios for Graph Neural Networks (GNNs). In this paper, we present a theoretical understanding of the impact of different heterophily patterns on GNNs by incorporating graph convolution (GC) operations into fully connected networks via the proposed Heterophilous Stochastic Block Models (HSBM), a general random graph model that can accommodate diverse heterophily patterns. First, we show that by applying a GC operation, the separability gains are determined by two factors, i.e., the Euclidean distance of the neighborhood distributions and $\sqrt{\mathbb{E}\left[\operatorname{deg}\right]}$, where $\mathbb{E}\left[\operatorname{deg}\right]$ is the average node degree. This reveals that the impact of heterophily on classification needs to be evaluated alongside the average node degree. Second, we show that topological noise has a detrimental impact on separability, which is equivalent to degrading $\mathbb{E}\left[\operatorname{deg}\right]$. Finally, when applying multiple GC operations, we show that the separability gains are determined by the normalized distance of the $l$-powered neighborhood distributions. This indicates that the nodes still possess separability as $l$ goes to infinity in a wide range of regimes. Extensive experiments on both synthetic and real-world data verify the effectiveness of our theory.
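
The two factors named in the first result can be illustrated with a small numerical sketch. This is not the paper's exact expression: the function names, the example block-probability matrices, and the simple product `distance * sqrt(avg_degree)` are assumptions chosen for illustration, and equal class sizes are assumed so that row-normalizing the block-probability matrix gives each class's neighborhood distribution.

```python
import numpy as np

def neighborhood_distributions(block_probs):
    """Row-normalize a class-by-class edge-probability matrix.

    Assumes equal class sizes, so row i becomes the distribution of
    neighbor classes seen by a node of class i (illustrative assumption).
    """
    block_probs = np.asarray(block_probs, dtype=float)
    return block_probs / block_probs.sum(axis=1, keepdims=True)

def pairwise_distance(dist, i, j):
    """Euclidean distance between the neighborhood distributions of classes i and j."""
    return float(np.linalg.norm(dist[i] - dist[j]))

def gain_proxy(dist, avg_degree, i, j):
    """Hypothetical proxy combining the two factors: distance * sqrt(average degree)."""
    return pairwise_distance(dist, i, j) * float(np.sqrt(avg_degree))

# Three two-class patterns (made-up numbers, for illustration only).
homophilous = neighborhood_distributions([[0.8, 0.2], [0.2, 0.8]])
heterophilous = neighborhood_distributions([[0.2, 0.8], [0.8, 0.2]])
mixed = neighborhood_distributions([[0.5, 0.5], [0.5, 0.5]])

for name, dist in [("homophilous", homophilous),
                   ("heterophilous", heterophilous),
                   ("mixed", mixed)]:
    print(name, pairwise_distance(dist, 0, 1), gain_proxy(dist, 16, 0, 1))
```

In this sketch the homophilous and strongly heterophilous patterns give the same distance between class-wise neighborhood distributions, while the mixed pattern gives zero. This matches the abstract's point that the effect of heterophily depends on how distinguishable the neighborhood distributions are, considered together with the average node degree.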
