Graph Propagation Transformer for Graph Representation Learning

Zhe Chen; Hao Tan; Tao Wang; Tianrun Shen; Tong Lu; Qiuying Peng; Cheng Cheng; Yue Qi

TL;DR

GPTrans is a transformer architecture for graph representation learning whose attention module explicitly propagates information among nodes and edges. The authors report that it outperforms many state-of-the-art transformer-based graph models on several benchmark datasets.

Abstract

This paper presents a novel transformer architecture for graph representation learning. The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks. Specifically, we propose a new attention mechanism called Graph Propagation Attention (GPA). It explicitly passes information among nodes and edges in three ways, i.e., node-to-node, node-to-edge, and edge-to-node, which is essential for learning graph-structured data. On this basis, we design an effective transformer architecture named Graph Propagation Transformer (GPTrans) to further help learn graph data. We verify the performance of GPTrans in a wide range of graph learning experiments on several benchmark datasets. The results show that our method outperforms many state-of-the-art transformer-based graph models. The code will be released at https://github.com/czczup/GPTrans.
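The three propagation paths named in the abstract can be illustrated with a small, single-head sketch. This is not the authors' implementation. The abstract gives no formulas, so the exact form of each path is an assumption made for illustration: the scalar edge bias, the residual edge update, and the attention-weighted edge aggregation are all guesses, and `gpa_sketch`, `init_params`, and the parameter names `w_q`, `w_k`, `w_v`, `w_reduce`, `w_expand`, `w_edge_out` are hypothetical. The released code at the URL above is the authoritative reference.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def linear(vec, weight):
    """Multiply a vector of length d_in by a d_in x d_out weight matrix."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def rand_matrix(rows, cols, rng):
    """Random rows x cols matrix as a list of lists."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(d, d_e, rng):
    """Hypothetical parameter set for node dim d and edge dim d_e."""
    return {
        "w_q": rand_matrix(d, d, rng),
        "w_k": rand_matrix(d, d, rng),
        "w_v": rand_matrix(d, d, rng),
        "w_reduce": rand_matrix(d_e, 1, rng),    # edge embedding -> scalar bias
        "w_expand": rand_matrix(1, d_e, rng),    # attention score -> edge update
        "w_edge_out": rand_matrix(d_e, d, rng),  # aggregated edges -> node space
    }


def gpa_sketch(x_node, x_edge, params):
    """Illustrative single-head pass over the three propagation paths.

    x_node: n node embeddings of size d.
    x_edge: n x n edge embeddings of size d_e (one per node pair).
    """
    n, d, d_e = len(x_node), len(x_node[0]), len(x_edge[0][0])
    q = [linear(x, params["w_q"]) for x in x_node]
    k = [linear(x, params["w_k"]) for x in x_node]
    v = [linear(x, params["w_v"]) for x in x_node]

    # (a) node-to-node: scaled dot-product scores, biased by a scalar
    # reduced from each edge embedding (assumed form).
    scores = [
        [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            + linear(x_edge[i][j], params["w_reduce"])[0]
            for j in range(n)
        ]
        for i in range(n)
    ]

    # (b) node-to-edge: expand each score to the edge dimension and add it
    # to the edge embedding as a residual (assumed form).
    new_edge = [
        [
            [e + u for e, u in zip(x_edge[i][j],
                                   linear([scores[i][j]], params["w_expand"]))]
            for j in range(n)
        ]
        for i in range(n)
    ]

    # (c) edge-to-node: each node aggregates values and its updated edge
    # embeddings using the same attention weights (assumed form).
    attn = [softmax(row) for row in scores]
    new_node = []
    for i in range(n):
        agg_v = [sum(attn[i][j] * v[j][c] for j in range(n)) for c in range(d)]
        agg_e = [sum(attn[i][j] * new_edge[i][j][c] for j in range(n))
                 for c in range(d_e)]
        edge_msg = linear(agg_e, params["w_edge_out"])
        new_node.append([a + b for a, b in zip(agg_v, edge_msg)])
    return new_node, new_edge
```

Calling `gpa_sketch` with three nodes of dimension 4 and edge embeddings of dimension 2 returns three updated node vectors of length 4 and a 3 x 3 grid of updated edge vectors of length 2, so the attention step hands both node and edge embeddings to the next layer.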

Paper Structure (31 sections, 10 equations, 3 figures, 6 tables)

Introduction
Related Works
Transformer
Graph Convolutional Network
GPTrans
Overall Architecture
Graph Embedding
Graph Propagation Attention
Node-to-Node
Node-to-Edge
Edge-to-Node
GPA in Transformer Blocks
Architecture Configurations
Experiments
Graph-Level Tasks
...and 16 more sections

Figures (3)

Figure 1: Illustration of the three ways for graph information propagation. Circles and black lines indicate nodes and edges, and green and pink cubes represent node embeddings and edge embeddings. Our GPTrans achieves better graph representation learning by explicitly constructing three ways for information propagation in the proposed Graph Propagation Attention (GPA) module, including (a) node-to-node, (b) node-to-edge, and (c) edge-to-node.
Figure 2: Overall architecture of GPTrans. It contains a graph embedding layer, $L$ transformer blocks, and a head. The graph embedding layer transforms the graph data into node embeddings $x_{\rm node}$ and edge embeddings $x_{\rm edge}$, as the input of the transformer blocks. Each transformer block comprises a Graph Propagation Attention (GPA) and a Feed-Forward Network (FFN). It is worth noting that we no longer need to maintain an FFN module specifically for edge embeddings due to the proposed GPA module, which improves the efficiency of our method. Finally, a head of two fully-connected layers is employed on the output embeddings for various graph tasks.
Figure 3: Illustration of Graph Propagation Attention. It explicitly builds three paths for information propagation among node embeddings and edge embeddings, including (a) node-to-node, (b) node-to-edge, and (c) edge-to-node.
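The wiring described in the Figure 2 caption can be sketched as follows: node and edge embeddings pass through $L$ blocks, only the node embeddings get an FFN, and a head of two fully-connected layers produces the output. This is a rough illustration under stated assumptions, not the authors' code. The attention is a placeholder that returns its inputs unchanged, the residual connections and the mean readout before the head are assumptions the caption does not state, normalization layers are omitted, and the inputs are taken to be already embedded. All function names are hypothetical.

```python
import random


def make_linear(d_in, d_out, rng):
    """Return a function applying a random d_in x d_out linear map."""
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(d_out)]
              for _ in range(d_in)]

    def apply(vec):
        return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]

    return apply


def relu(vec):
    return [max(0.0, v) for v in vec]


def placeholder_attention(x_node, x_edge):
    """Stand-in for GPA: returns both embeddings unchanged.

    A real GPA module would update nodes and edges here.
    """
    return x_node, x_edge


def make_block(d, rng, attention=placeholder_attention):
    """One block: attention over nodes and edges, then an FFN on nodes only."""
    fc1, fc2 = make_linear(d, 2 * d, rng), make_linear(2 * d, d, rng)

    def block(x_node, x_edge):
        attn_node, x_edge = attention(x_node, x_edge)
        # Residual connection around attention (assumed).
        x_node = [[a + b for a, b in zip(x, y)]
                  for x, y in zip(x_node, attn_node)]
        # The FFN is applied to node embeddings only; edges have no FFN.
        x_node = [[a + b for a, b in zip(x, fc2(relu(fc1(x))))]
                  for x in x_node]
        return x_node, x_edge

    return block


def make_model(d, num_blocks, num_outputs, rng):
    """Stack num_blocks blocks and add a head of two fully-connected layers."""
    blocks = [make_block(d, rng) for _ in range(num_blocks)]
    head1 = make_linear(d, d, rng)
    head2 = make_linear(d, num_outputs, rng)

    def forward(x_node, x_edge):
        for block in blocks:
            x_node, x_edge = block(x_node, x_edge)
        # Mean readout over nodes for a graph-level output (assumed).
        pooled = [sum(col) / len(x_node) for col in zip(*x_node)]
        return head2(relu(head1(pooled)))

    return forward
```

Passing a different callable as `attention` to `make_block` swaps in a real attention module without changing the rest of the wiring.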
