Representation Learning on Heterophilic Graph with Directional Neighborhood Attention

Qincheng Lu, Jiaqi Zhu, Sitao Luan, Xiao-Wen Chang

TL;DR

The proposed Directional Graph Attention Network (DGAT) combines feature-based attention with global directional information extracted from the graph topology, and outperforms state-of-the-art (SOTA) models on 6 out of 7 real-world benchmark datasets.

Abstract

Graph Attention Network (GAT) is one of the most popular Graph Neural Network (GNN) architectures. It employs an attention mechanism to learn edge weights and has demonstrated promising performance in various applications. However, since it only incorporates information from the immediate neighborhood, it cannot capture long-range and global graph information, which leads to unsatisfactory performance on some datasets, particularly on heterophilic graphs. To address this limitation, we propose the Directional Graph Attention Network (DGAT). DGAT combines feature-based attention with global directional information extracted from the graph topology. To this end, we propose a new class of Laplacian matrices that provably reduces the diffusion distance between nodes. Based on the new Laplacian, we propose topology-guided neighbor pruning and edge adding mechanisms that remove noisy neighbors and capture helpful long-range neighborhood information. In addition, a global directional attention is designed to enable topology-aware information propagation. The superiority of DGAT over the baseline GAT is verified through experiments on real-world benchmarks and synthetic datasets. DGAT also outperforms state-of-the-art (SOTA) models on 6 out of 7 real-world benchmark datasets.

Paper Structure

This paper contains 35 sections, 3 theorems, 46 equations, 2 figures, 8 tables.

Key Result

Theorem 3.2

The matrix $\mathbf{P}^{(\alpha, \gamma)}$ defined in Definition 3.1 is non-negative (i.e., all of its elements are non-negative), and when $\alpha=1$, $\mathbf{P}^{(\alpha, \gamma)}\boldsymbol{1} = \boldsymbol{1}$, i.e., every row sums to one. See the appendix for the proof.
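
The two properties in Theorem 3.2 are easy to check numerically for any candidate matrix. The sketch below is only an illustration: this page does not reproduce the definition of $\mathbf{P}^{(\alpha, \gamma)}$, so the code uses the standard random-walk matrix $\mathbf{D}^{-1}\mathbf{A}$ of a small toy graph as a stand-in. The helper names `is_nonnegative` and `is_row_stochastic`, the toy adjacency matrix, and the tolerances are assumptions and do not come from the paper.

```python
import numpy as np


def is_nonnegative(P, tol=1e-12):
    """Return True if every entry of P is >= 0 (up to a small tolerance)."""
    return bool(np.all(P >= -tol))


def is_row_stochastic(P, tol=1e-9):
    """Return True if P @ 1 equals 1, i.e. every row of P sums to one."""
    ones = np.ones(P.shape[0])
    return bool(np.allclose(P @ ones, ones, atol=tol))


# Stand-in matrix (assumption, NOT the paper's P^(alpha, gamma)):
# the random-walk matrix D^{-1} A of a small undirected toy graph.
A = np.array(
    [
        [0, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
deg = A.sum(axis=1)   # node degrees; all positive for this graph
P = A / deg[:, None]  # divide each row by its degree

print("non-negative:", is_nonnegative(P))
print("row-stochastic:", is_row_stochastic(P))
```

Under these assumptions, the same two helpers could be applied to an implementation of $\mathbf{P}^{(\alpha, \gamma)}$ with $\alpha=1$ as a quick sanity check against the theorem.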

Figures (2)

  • Figure 1: The workflow of DGAT.
  • Figure 2: The performance of a one-layer GCN with aggregation weights defined by $\mathbf{P}^{(1,\gamma)}$ on synthetic graphs. Results are shown separately for heterophily ($\mu < 0.5$) in the left panel and homophily ($\mu \geq 0.5$) in the right panel, with the solid line showing the mean accuracy over 5 splits. The optimal $\gamma$ for each $\mu$ is highlighted with a square marker.

Theorems & Definitions (7)

  • Definition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • proof
  • proof
  • proof