Stronger Graph Transformer with Regularized Attention Scores

Eugene Ku

TL;DR

A novel version of the edge regularization technique is proposed that alleviates the need for Positional Encoding and ultimately alleviates GT's out-of-memory issue.

Abstract

Graph Neural Networks are notorious for their memory consumption. A recent Transformer-based GNN, the Graph Transformer (GT), has been shown to achieve superior performance when long-range dependencies exist. However, combining graph data with the Transformer architecture leads to a compounded memory issue. We propose a novel version of the "edge regularization technique" that alleviates the need for Positional Encoding and ultimately alleviates GT's out-of-memory issue. It remains unclear whether applying edge regularization on top of Positional Encoding is helpful. However, applying our edge regularization technique does stably improve GT's performance compared to GT without Positional Encoding.
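The abstract does not spell out the regularizer, so the following is only a minimal sketch of one plausible reading: an auxiliary loss that pulls each node's attention scores toward the graph's edges, so that attention can pick up structure without Positional Encoding. The row-normalized adjacency target, the squared-error penalty, the weight `reg_weight`, and all function names are assumptions made for illustration, not the formulation used in the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(queries, keys):
    """Dense scaled dot-product attention scores between all node pairs."""
    d = len(queries[0])
    logits = [
        [sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d) for kj in keys]
        for qi in queries
    ]
    return [softmax(row) for row in logits]


def edge_regularization_loss(scores, adjacency):
    """Mean squared gap between attention scores and a row-normalized
    adjacency matrix (assumed target, chosen only for illustration)."""
    n = len(adjacency)
    total = 0.0
    for i in range(n):
        degree = sum(adjacency[i])
        for j in range(n):
            target = adjacency[i][j] / degree if degree > 0 else 0.0
            total += (scores[i][j] - target) ** 2
    return total / (n * n)


def total_loss(task_loss, scores, adjacency, reg_weight=0.1):
    """Task loss plus the weighted regularizer (reg_weight is hypothetical)."""
    return task_loss + reg_weight * edge_regularization_loss(scores, adjacency)


# Path graph 0 - 1 - 2 with all-zero queries, which yields uniform attention.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
uniform = attention_scores([[0.0, 0.0]] * 3, [[1.0, 2.0]] * 3)
print(edge_regularization_loss(uniform, adjacency))
```

Under these assumptions the penalty is zero only when attention mass sits exactly on each node's neighbors, and it grows as attention spreads to non-adjacent nodes.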

Paper Structure (13 sections, 2 equations, 8 figures, 1 table)

Introduction
Background & Related Work
Evolution of Graph Neural Networks
Introduction to Graph Transformers
Limitations of Graph Transformers
GraphGPS Architecture
Proposed Method
Results
Application Study of GraphGPS
Conclusion
Appendix
Our Configuration of GraphGPS
Results of our Edge Regularization on other datasets

Figures (8)

Figure 1: Oversquashing [2]
Figure 2: Superior performance of models with Graph Transformers on LRGB [6]
Figure 3: RWSE allows expressiveness beyond Color Refinement Algorithm [2]
Figure 4: RWSE is not guaranteed to produce a unique encoding for each node [2]
Figure 5: LRGB Dataset description [6]
...and 3 more figures
