ES-GNN: Generalizing Graph Neural Networks Beyond Homophily with Edge Splitting

Jingwei Guo, Kaizhu Huang, Rui Zhang, Xinping Yi

TL;DR

This work proposes Edge Splitting GNN (ES-GNN), a framework that adaptively separates graph edges that are relevant to the learning task from those that are not, and shows that ES-GNN can be regarded as a solution to a disentangled graph denoising problem.

Abstract

While Graph Neural Networks (GNNs) have achieved enormous success in multiple graph analytical tasks, modern variants mostly rely on the strong inductive bias of homophily. However, real-world networks typically exhibit both homophilic and heterophilic linking patterns, wherein adjacent nodes may have dissimilar attributes and distinct labels. Therefore, GNNs that smooth node proximity holistically may aggregate both task-relevant and task-irrelevant (even harmful) information, which limits their ability to generalize to heterophilic graphs and can make them less robust. In this work, we propose a novel Edge Splitting GNN (ES-GNN) framework that adaptively distinguishes between graph edges that are relevant and irrelevant to the learning task. This dynamically transforms the original graph into two subgraphs with the same node set but complementary edge sets. Information propagation on each of these subgraphs and edge splitting are then conducted alternately, thus disentangling the task-relevant and task-irrelevant features. Theoretically, we show that ES-GNN can be regarded as a solution to a disentangled graph denoising problem, which further illustrates our motivation and explains the improved generalization beyond homophily. Extensive experiments on 11 benchmark datasets and 1 synthetic dataset not only demonstrate the effectiveness of ES-GNN but also highlight its robustness to adversarial graphs and its mitigation of the over-smoothing problem.
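
The split-then-propagate idea described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the scoring rule (cosine similarity against a fixed `threshold`), the hard split, the mean aggregator, and the names `split_edges` and `propagate` are all hypothetical, and the same features are used for both channels, whereas the paper learns the split and works with separate task-relevant and task-irrelevant channels.

```python
import numpy as np

def split_edges(adj, feats, threshold=0.0):
    """Hard-split the edges of `adj` into two complementary sets.

    Stand-in scoring rule (an assumption, not the paper's learned one):
    an edge counts as 'relevant' when the cosine similarity of its
    endpoint features exceeds `threshold`.
    """
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                          # pairwise cosine similarity
    relevant = (sim > threshold) & (adj > 0)     # only existing edges qualify
    adj_r = np.where(relevant, adj, 0.0)         # task-relevant subgraph
    adj_ir = adj - adj_r                         # remaining edges
    return adj_r, adj_ir

def propagate(adj, feats):
    """One step of mean aggregation over neighbours plus a self-loop."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return (a_hat / deg) @ feats

# Toy graph: the 4-cycle 0-1-2-3-0 with two groups of similar nodes.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]])

adj_r, adj_ir = split_edges(adj, feats)
z_r = propagate(adj_r, feats)     # aggregated over the 'relevant' edges only
z_ir = propagate(adj_ir, feats)   # aggregated over the 'irrelevant' edges only
```

Both subgraphs keep the full node set, and every original edge ends up in exactly one of them, which is the "complementary edge sets" property the abstract refers to.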

Paper Structure (33 sections, 2 theorems, 16 equations, 9 figures, 5 tables, 1 algorithm)

Key Result

Theorem 1

The proposed ES-GNN is equivalent to the solution of the disentangled graph denoising problem (the equation labeled eq:disen_gsd in the paper).
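
The disentangled objective that Theorem 1 refers to is not reproduced in this summary. For orientation only, the sketch below solves the classical single-channel graph signal denoising problem, $\min_{\mathbf{Z}} \|\mathbf{Z}-\mathbf{X}\|_F^2 + \lambda\,\mathrm{tr}(\mathbf{Z}^\top \mathbf{L}\mathbf{Z})$ with $\mathbf{L}=\mathbf{D}-\mathbf{A}$, whose minimiser satisfies $(\mathbf{I}+\lambda\mathbf{L})\mathbf{Z}=\mathbf{X}$. It is not the paper's equation; the function names and the value of `lam` are hypothetical.

```python
import numpy as np

def laplacian(adj):
    """Unnormalised graph Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def denoise(adj, x, lam=1.0):
    """Closed-form minimiser of ||Z - X||_F^2 + lam * tr(Z^T L Z).

    Setting the gradient 2(Z - X) + 2*lam*L*Z to zero gives
    (I + lam*L) Z = X, which is solved directly here.
    """
    n = adj.shape[0]
    return np.linalg.solve(np.eye(n) + lam * laplacian(adj), x)

def objective(adj, x, z, lam=1.0):
    """Fidelity term plus the graph smoothness penalty."""
    return float(np.sum((z - x) ** 2) + lam * np.trace(z.T @ laplacian(adj) @ z))

# Path graph 0-1-2 with a signal that is not smooth across the middle node.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.array([[1.0], [0.0], [1.0]])

z = denoise(adj, x, lam=0.5)
print(objective(adj, x, x, lam=0.5), objective(adj, x, z, lam=0.5))
```

The smoothness term pulls connected nodes together regardless of whether the edge is helpful, which is the behaviour the paper argues against for heterophilic graphs.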

Figures (9)

  • Figure 1: A toy example to show differences between conventional GNNs and our ES-GNN in aggregating node features. Conventional GNNs with local smoothness tend to produce non-discriminative representations on heterophilic graphs, while our ES-GNN is able to disentangle and exclude the task-harmful features from the final predictive target.
  • Figure 2: Illustration of the ES-GNN framework, where $\mathbf{A}$ and $\mathbf{X}$ denote the adjacency matrix and the node feature matrix, respectively. First, $\mathbf{X}$ is projected onto different latent subspaces via two channels, $\text{R}$ and $\text{IR}$. Edge splitting then divides the original graph edges into two complementary sets. Node information is aggregated separately on each edge set to produce disentangled representations, which are in turn used to make a more accurate edge splitting in the next layer. The task-relevant representation $\mathbf{Z}^{'}_{\text{R}}$ is used for prediction, and an Irrelevant Consistency Regularization (ICR) term further reduces potentially task-harmful information in the final predictive target.
  • Figure 3: Synthetic graphs with varying levels of homophily. Node shape and color refer to the explicit and implicit attributes, respectively. Nodes sharing the same shape (or color) are connected with probability $P_\text{E}$ (or $P_\text{I}$) and are classified into three categories based only on their shapes. In this context, "shape" attributes represent task-relevant features, whereas "color" attributes denote task-irrelevant ones. Intuitively, adequate disentanglement of these attributes is crucial for classification; otherwise, predictions are misled by the task-irrelevant "color" information.
  • Figure 4: Feature correlation analysis. Two distinct patterns (task-relevant and task-irrelevant topologies) can be learned on Chameleon with $\mathcal{H}=0.23$, while almost all information is retained in the task-relevant channel (0-31) on Cora with $\mathcal{H}=0.81$. On the synthetic graphs in (c), (d), and (e), the block-wise pattern in the task-irrelevant channel (32-63) gradually attenuates as the homophily ratio increases from $0.1$ through $0.5$ to $0.9$. ES-GNN thus offers one general framework that adapts to both heterophilic and homophilic graphs.
  • Figure 5: Results of different models on synthetic graphs with varied homophily ratios, where ES-GNN consistently outperforms all the baselines.
  • ...and 4 more figures
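
The synthetic setup in the Figure 3 caption and the homophily ratio $\mathcal{H}$ quoted in Figure 4 can be illustrated with a small generator. This is a rough sketch under assumptions, not the paper's procedure: it assumes $\mathcal{H}$ is the commonly used edge homophily ratio (the fraction of edges joining same-label nodes), draws shape-based and color-based links independently and takes their union, and uses three colors. The function names and parameters are hypothetical.

```python
import numpy as np

def synthetic_graph(n, p_e, p_i, n_shapes=3, n_colors=3, seed=0):
    """Random undirected graph with 'shape' (label) and 'color' attributes.

    Assumption: a same-shape pair is linked with probability p_e, a
    same-color pair with probability p_i, and the two draws are independent.
    """
    rng = np.random.default_rng(seed)
    shape = rng.integers(0, n_shapes, size=n)    # task-relevant attribute
    color = rng.integers(0, n_colors, size=n)    # task-irrelevant attribute
    same_shape = shape[:, None] == shape[None, :]
    same_color = color[:, None] == color[None, :]
    linked = (same_shape & (rng.random((n, n)) < p_e)) | \
             (same_color & (rng.random((n, n)) < p_i))
    upper = np.triu(linked, k=1)                 # drop self-loops, one draw per pair
    adj = (upper | upper.T).astype(float)        # symmetrise
    return adj, shape, color

def edge_homophily(adj, labels):
    """Fraction of undirected edges whose endpoints share a label."""
    src, dst = np.nonzero(np.triu(adj, k=1))
    if src.size == 0:
        return float("nan")
    return float(np.mean(labels[src] == labels[dst]))

adj, shape, color = synthetic_graph(n=60, p_e=0.3, p_i=0.3)
print("homophily w.r.t. shape:", edge_homophily(adj, shape))
```

Raising `p_e` relative to `p_i` makes more edges connect same-shape nodes, so the ratio measured against the shape labels moves toward 1.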

Theorems & Definitions (4)

  • Theorem 1
  • proof
  • Lemma 1
  • proof