Graph Transformers without Positional Encodings

Ayush Garg

TL;DR

The paper tackles injecting graph inductive biases, such as locality and connectivity, into Graph Transformers without handcrafted positional encodings. It proposes Eigenformer, whose spectrum-aware attention computes potentials $\sigma_k[i,j]$ from the Laplacian eigenvectors $u_k$ and eigenvalues $\lambda_k$ and learns the importance of each frequency via $\phi_2(\lambda_k)$. The final attention is $\alpha[i,j] = \mathrm{softmax}_j\left(\phi_1\left(\sum_k \sigma_k[i,j]\phi_2(\lambda_k)\right)\right)$. On the theoretical side, the paper proves that this mechanism can express graph connectivity matrices and is invariant to eigenvector sign and to the choice of basis within degenerate eigenspaces. Empirically, it reports performance competitive with state-of-the-art Graph Transformers on standard benchmarks. This PE-free approach offers a way to capture local and long-range structure while reducing the need for extensive hand-designed encodings.
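The attention formula above can be illustrated with a small NumPy sketch. The summary does not define the potential $\sigma_k[i,j]$, so the code assumes $\sigma_k[i,j] = (u_k[i] - u_k[j])^2$, chosen only because it is unchanged when an eigenvector flips sign; the paper's actual definition may differ. The functions `phi1` and `phi2` are fixed, hypothetical stand-ins for the learned functions, and a normalized Laplacian of a toy path graph is used as input.

```python
import numpy as np


def softmax_rows(x):
    """Numerically stable softmax over the last axis (index j)."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def spectrum_aware_attention(U, lam, phi1, phi2):
    """alpha[i, j] = softmax_j(phi1(sum_k sigma_k[i, j] * phi2(lam_k))).

    U   : (n, K) array, column k is the eigenvector u_k
    lam : (K,) array of eigenvalues
    """
    # ASSUMPTION (not stated in the summary): sigma_k[i, j] = (u_k[i] - u_k[j])**2
    diff = U[:, None, :] - U[None, :, :]   # diff[i, j, k] = u_k[i] - u_k[j]
    sigma = diff ** 2                       # shape (n, n, K)
    scores = phi1(sigma @ phi2(lam))        # weighted sum over k, shape (n, n)
    return softmax_rows(scores)


# Toy example: the 4-node path graph 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))  # every degree is positive here
L_norm = np.eye(4) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
lam, U = np.linalg.eigh(L_norm)

# Hypothetical fixed stand-ins for the learned functions phi_1 and phi_2.
phi2 = lambda l: np.exp(-l)
phi1 = lambda s: -s

alpha = spectrum_aware_attention(U, lam, phi1, phi2)

# Flipping eigenvector signs leaves alpha unchanged under the assumed sigma.
signs = np.array([1.0, -1.0, 1.0, -1.0])
alpha_flipped = spectrum_aware_attention(U * signs, lam, phi1, phi2)
```

Each row of `alpha` is a probability distribution over target nodes, and `alpha_flipped` matches `alpha`, which illustrates the sign invariance mentioned above for this particular choice of potential.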

Abstract

Recently, Transformers for graph representation learning have become increasingly popular, achieving state-of-the-art performance on a wide variety of graph datasets, either alone or in combination with message-passing graph neural networks (MP-GNNs). Infusing graph inductive biases in the innately structure-agnostic transformer architecture in the form of structural or positional encodings (PEs) is key to achieving these impressive results. However, designing such encodings is tricky, and disparate attempts have been made to engineer them, including Laplacian eigenvectors, relative random-walk probabilities (RRWP), spatial encodings, centrality encodings, edge encodings, etc. In this work, we argue that such encodings may not be required at all, provided the attention mechanism itself incorporates information about the graph structure. We introduce Eigenformer, a Graph Transformer employing a novel spectrum-aware attention mechanism cognizant of the Laplacian spectrum of the graph, and empirically show that it achieves performance competitive with SOTA Graph Transformers on a number of standard GNN benchmarks. Additionally, we theoretically prove that Eigenformer can express various graph structural connectivity matrices, which is particularly essential when learning over smaller graphs.

Paper Structure

This paper contains 15 sections, 2 theorems, 18 equations, 5 figures, and 6 tables.

Key Result

Proposition 1

For any $n \in \mathbb{N}$, consider the adjacency matrix $A$ drawn from the set of adjacency matrices of $n$-node undirected graphs, $\mathbb{G}_n \subset \{0,1\}^{n \times n}$. Further, let $L_{norm} = I - D^{-\frac{1}{2}}AD^{-\frac{1}{2}} = I - A_{norm}$ be the normalized graph Laplacian of the graph [...] for suitable functions $\phi_1$ and $\phi_2$, where $SPD[i,j]$ is the shortest path distance between [...]
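The two objects named in the proposition, the normalized Laplacian $L_{norm}$ and the shortest-path-distance matrix $SPD$, can be computed for a small graph as in the sketch below. The function names are illustrative, not taken from the paper, and the treatment of isolated nodes (setting their $D^{-\frac{1}{2}}$ entry to zero) is an assumption, since the excerpt does not say how they are handled.

```python
from collections import deque

import numpy as np


def normalized_laplacian(A):
    """L_norm = I - D^{-1/2} A D^{-1/2} for an undirected 0/1 adjacency matrix."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    # ASSUMPTION: isolated nodes get a zero entry in D^{-1/2}.
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    A_norm = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.eye(A.shape[0]) - A_norm


def shortest_path_distances(A):
    """SPD[i, j] via breadth-first search; unreachable pairs are set to inf."""
    A = np.asarray(A)
    n = A.shape[0]
    spd = np.full((n, n), np.inf)
    for source in range(n):
        spd[source, source] = 0.0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(A[u]):
                if np.isinf(spd[source, v]):
                    spd[source, v] = spd[source, u] + 1.0
                    queue.append(v)
    return spd


# Toy example: the 4-node path graph 0-1-2-3.
A_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]])
L_path = normalized_laplacian(A_path)
eigvals, eigvecs = np.linalg.eigh(L_path)   # real spectrum, ascending order
spd_path = shortest_path_distances(A_path)
```

For an undirected graph, `L_path` is symmetric and its eigenvalues lie in $[0, 2]$, so `np.linalg.eigh` is the appropriate solver.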

Figures (5)

  • Figure 1: Example molecule from PCQM4Mv2 dataset: Substructures are revealed by eigenvectors
  • Figure 2: Eigenformer Architecture
  • Figure 3: (Smoothed) potential $\sigma_k$ vs eigenvalue $\lambda_k$
  • Figure 4: k-hop neighborhood, shortest-path distance and learned attention matrices for an example graph from the ZINC dataset
  • Figure 5: Percentage change in MAE with decreasing number (k) of eigenvalues used

Theorems & Definitions (2)

  • Proposition 1
  • Proposition 2