Understanding Heterophily for Graph Neural Networks

Junfu Wang; Yuanfang Guo; Liang Yang; Yunhong Wang

TL;DR

This work introduces the Heterophilous Stochastic Block Model (HSBM), a random graph model that accommodates diverse heterophily patterns, and analyzes how a Graph Convolution (GC) operation affects multi-class node classification. It shows that the separability gain from one GC operation is governed by two factors: the Euclidean distance between class-wise neighborhood distributions and the square root of the average node degree. Neighborhood inconsistency (topological noise) hurts separability in a way that is equivalent to lowering the average degree, and when $l$ GC operations are stacked the gain depends on the normalized distance between the $l$-powered neighborhood distributions. The authors derive explicit forms for separability under GC and validate the theory on synthetic data and eight real-world datasets, showing when GC improves or harms performance under different heterophily regimes. Overall, the framework provides principled guidance for designing GNNs on heterophilous graphs and for interpreting GC behavior beyond homophily assumptions.
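The two factors named above can be explored with a small simulation. The following is a minimal sketch under simplifying assumptions, not the paper's HSBM or its separability measure: neighbors are drawn with replacement from a class-wise neighborhood distribution matrix `M`, every node has the same degree `deg`, class means are one-hot vectors, and `separability` is a hypothetical proxy (smallest distance between class means divided by the average within-class standard deviation). The names `simulate_gc` and `separability` are illustrative only.

```python
import numpy as np


def separability(Z, y):
    """Proxy score: smallest distance between class means over average within-class std."""
    classes = np.unique(y)
    means = np.stack([Z[y == k].mean(axis=0) for k in classes])
    spread = np.mean([Z[y == k].std(axis=0).mean() for k in classes])
    dists = [
        np.linalg.norm(means[a] - means[b])
        for a in range(len(classes))
        for b in range(a + 1, len(classes))
    ]
    return min(dists) / spread


def simulate_gc(M, deg, n_per_class=500, sigma=1.0, seed=0):
    """Return (score before, score after) one mean-aggregation GC step.

    M[k] is the (assumed) distribution over neighbor classes for nodes of class k.
    """
    rng = np.random.default_rng(seed)
    c = M.shape[0]
    mu = np.eye(c)                                  # hypothetical one-hot class means
    y = np.repeat(np.arange(c), n_per_class)        # nodes of class k occupy one contiguous block
    X = mu[y] + sigma * rng.standard_normal((y.size, c))

    # Draw the class of each of the deg neighbors from the row of M for the node's class.
    nbr_cls = np.concatenate(
        [rng.choice(c, size=(n_per_class, deg), p=M[k]) for k in range(c)]
    )
    # Pick a uniformly random node inside each drawn class block (with replacement).
    nbrs = nbr_cls * n_per_class + rng.integers(0, n_per_class, size=nbr_cls.shape)

    H = X[nbrs].mean(axis=1)                        # mean aggregation over sampled neighbors
    return separability(X, y), separability(H, y)


# Strongly heterophilous but well-separated neighborhood distributions.
M_het = np.array([[0.1, 0.9], [0.9, 0.1]])
for deg in (10, 40):
    before, after = simulate_gc(M_het, deg)
    print(f"deg={deg}: before={before:.2f}, after={after:.2f}")
```

In this toy setting, rows of `M` that are far apart keep the aggregated class means apart, while a larger `deg` averages away more feature noise. Identical rows, such as `np.full((2, 2), 0.5)`, collapse the aggregated class means onto each other.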

Abstract

Graphs with heterophily have been regarded as challenging scenarios for Graph Neural Networks (GNNs), where nodes are connected with dissimilar neighbors through various patterns. In this paper, we present a theoretical understanding of the impacts of different heterophily patterns for GNNs by incorporating the graph convolution (GC) operations into fully connected networks via the proposed Heterophilous Stochastic Block Models (HSBM), a general random graph model that can accommodate diverse heterophily patterns. First, we show that by applying a GC operation, the separability gains are determined by two factors, i.e., the Euclidean distance of the neighborhood distributions and $\sqrt{\mathbb{E}\left[\operatorname{deg}\right]}$, where $\mathbb{E}\left[\operatorname{deg}\right]$ is the average node degree. This reveals that the impact of heterophily on classification needs to be evaluated alongside the average node degree. Second, we show that the topological noise has a detrimental impact on separability, which is equivalent to degrading $\mathbb{E}\left[\operatorname{deg}\right]$. Finally, when applying multiple GC operations, we show that the separability gains are determined by the normalized distance of the $l$-powered neighborhood distributions. This indicates that the nodes still possess separability as $l$ goes to infinity in a wide range of regimes. Extensive experiments on both synthetic and real-world data verify the effectiveness of our theory.
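To make the phrase "$l$-powered neighborhood distributions" concrete, here is a small sketch that assumes it refers to the rows of the matrix power $\mathbf{M}^l$ of a row-stochastic neighborhood distribution matrix $\mathbf{M}$. The normalization mentioned in the abstract is not specified in this summary, so the sketch reports only the raw Euclidean distance between two rows. The function name `powered_row_distances` and both example matrices are hypothetical.

```python
import numpy as np


def powered_row_distances(M, max_l):
    """Raw Euclidean distance between rows 0 and 1 of M**l for l = 1..max_l."""
    P = np.eye(M.shape[0])
    out = []
    for _ in range(max_l):
        P = P @ M                                   # P now holds the next power of M
        out.append(float(np.linalg.norm(P[0] - P[1])))
    return out


# Mixed pattern: each class links to both classes with different proportions.
M_mixed = np.array([[0.3, 0.7], [0.6, 0.4]])
# Strict heterophily: each class links only to the other class.
M_strict = np.array([[0.0, 1.0], [1.0, 0.0]])

print(powered_row_distances(M_mixed, 4))
print(powered_row_distances(M_strict, 4))
```

For the mixed example the raw distance shrinks by a constant factor with each power, whereas for the strict example it stays the same at every power. This is one way to see why the unnormalized distance alone does not settle what happens for large $l$.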

Paper Structure (39 sections, 19 theorems, 139 equations, 20 figures, 3 tables)

Introduction
Preliminaries
Heterophilous Stochastic Block Models
Theoretical Results
Setting up the Baseline
Impact of Heterophily for Graph Convolution
Impact of Neighborhood Inconsistency
Impact of Stacking Multiple Graph Convolutions
Experiments
Synthetic Data
Real World Data
Conclusion and Future Work
Theoretical Proofs
Proof of Theorem 1
Bayesian Classifier
...and 24 more sections

Key Result

Theorem 1

Given $\left(\boldsymbol{X}, \boldsymbol{A}\right) = {\rm HSBM}\left(n, c, \sigma, \left\{\boldsymbol{\mu}_k\right\}, \boldsymbol{\eta}, \mathbf{M}, \left\{\boldsymbol{\Delta}_i\right\}\right)$, two properties over data $\boldsymbol{X}$ can be obtained.

Figures (20)

Figure 1: Examples of different heterophily patterns. The accuracy of MLP is 71.12.
Figure 2: Results on synthetic data. (a)-(c) present the results of MLP incorporating one GC operation, while (d) displays the results with multiple GC operations. In each subfigure, the gray region, which is enclosed by the minimum gain and maximum gain curves, represents the area of separability gains. The Pearson Correlation represents the Pearson correlation coefficient between the separability gains and the differences in the confusion matrix with and without the GC operations.
Figure 3: Example of mixed heterophily with $a=0.04$. The accuracy of GCN is 74.24.
Figure 4: Example of mixed heterophily with $a=0.05$. The accuracy of GCN is 70.32.
Figure 5: Example of bad heterophily with $a=0.1$. The accuracy of GCN is 45.10.
...and 15 more figures

Theorems & Definitions (40)

Definition 1
Theorem 1
Theorem 2
Definition 2
Theorem 3
Theorem 4
Proposition 1
Lemma A1: Bayesian Classifier
proof
proof
...and 30 more
