Supra-Laplacian Encoding for Transformer on Dynamic Graphs

Yannis Karmim; Marc Lafon; Raphael Fournier S'niehotta; Nicolas Thome

TL;DR

This work introduces Supra-LAplacian encoding for spatio-temporal TransformErs (SLATE), a new spatio-temporal encoding that lets the fully connected Graph Transformer (GT) architecture be used on dynamic graphs while preserving spatio-temporal information.

Abstract

Fully connected Graph Transformers (GTs) have rapidly become prominent in the static graph community as an alternative to Message-Passing models, which suffer from a lack of expressivity, oversquashing, and under-reaching. However, in a dynamic context, by interconnecting all nodes at multiple snapshots with self-attention, GTs lose both structural and temporal information. In this work, we introduce Supra-LAplacian encoding for spatio-temporal TransformErs (SLATE), a new spatio-temporal encoding to leverage the GT architecture while keeping spatio-temporal information. Specifically, we transform Discrete Time Dynamic Graphs (DTDGs) into multi-layer graphs and take advantage of the spectral properties of their associated supra-Laplacian matrix. Our second contribution explicitly models nodes' pairwise relationships with a cross-attention mechanism, providing an accurate edge representation for dynamic link prediction. SLATE outperforms numerous state-of-the-art methods based on Message-Passing Graph Neural Networks combined with recurrent models (e.g., LSTM), as well as Dynamic Graph Transformers, on 9 datasets. Code is available at: github.com/ykrmm/SLATE.
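To make the multi-layer construction concrete, here is a minimal NumPy sketch of how a DTDG might be stacked into a supra-adjacency matrix and how the low-frequency eigenvectors of its supra-Laplacian could serve as a positional encoding. It is an illustration under stated assumptions, not the authors' implementation: the function names `supra_adjacency` and `supra_laplacian_encoding`, the `temporal_weight` parameter, the unnormalized Laplacian, and the rule of linking every node to its own copy in the next snapshot are all choices made for this example.

```python
import numpy as np


def supra_adjacency(snapshots, temporal_weight=1.0):
    """Stack T snapshots (each an N x N adjacency matrix) into a (T*N) x (T*N) matrix.

    Assumption for this sketch: each node is linked to its own copy in the
    next snapshot with weight `temporal_weight`.
    """
    T = len(snapshots)
    N = snapshots[0].shape[0]
    A = np.zeros((T * N, T * N))
    # Diagonal blocks hold the spatial edges of each snapshot.
    for t, A_t in enumerate(snapshots):
        A[t * N:(t + 1) * N, t * N:(t + 1) * N] = A_t
    # Off-diagonal blocks hold the temporal edges between consecutive snapshots.
    idx = np.arange(N)
    for t in range(T - 1):
        A[t * N + idx, (t + 1) * N + idx] = temporal_weight
        A[(t + 1) * N + idx, t * N + idx] = temporal_weight
    return A


def supra_laplacian_encoding(snapshots, k):
    """Return all eigenvalues and the k eigenvectors that follow the trivial one."""
    A = supra_adjacency(snapshots)
    L = np.diag(A.sum(axis=1)) - A          # unnormalized Laplacian (assumption)
    eigvals, eigvecs = np.linalg.eigh(L)    # symmetric matrix, ascending eigenvalues
    return eigvals, eigvecs[:, 1:k + 1]     # skip the constant eigenvector


def edge(i, j, n=3):
    """Helper: adjacency matrix of an n-node graph with a single undirected edge."""
    M = np.zeros((n, n))
    M[i, j] = M[j, i] = 1.0
    return M


# Toy dynamic graph: 3 nodes, 3 snapshots, one edge per snapshot.
snapshots = [edge(0, 1), edge(1, 2), edge(0, 2)]
A = supra_adjacency(snapshots)
eigvals, encoding = supra_laplacian_encoding(snapshots, k=4)
```

Each row of `encoding` corresponds to one (node, snapshot) pair, so the same node receives a different vector at each time step. This only works as a single joint encoding when the supra-graph is connected, which holds for the toy graph above but not for real data with isolated nodes.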

Paper Structure (37 sections, 10 equations, 11 figures, 17 tables, 1 algorithm)

Introduction
Related work
Dynamic Graph Neural Networks on DTDGs.
Graph Transformer.
Dynamic Graph Transformers.
Dynamic Link Prediction methods.
The SLATE Method
Supra-Laplacian as Spatio-Temporal Encoding
Fully-connected spatio-temporal transformer
Edge Representation with Cross-Attention
SLATE Scalability
Experiments
Comparison to state-of-the-art
Model Analysis
Impact of the time-window size.
...and 22 more sections

Figures (11)

Figure 1: SLATE is a fully connected transformer for dynamic link prediction, which innovatively performs a joint spatial and temporal encoding of the dynamic graph. SLATE models a DTDG as a multi-layer graph with temporal dependencies between a node and its past. Building the supra-adjacency matrix of a randomly generated toy dynamic graph with 3 snapshots (left) and analysing the spectrum of its associated supra-Laplacian (right) provide fundamental spatio-temporal information. The projections on eigenvectors associated with smaller eigenvalues (e.g., $\lambda_1$) capture global graph dynamics: node colors are different for each time step. Eigenvectors associated with larger eigenvalues (e.g., $\lambda_{\text{max}}$) capture more localized spatio-temporal information (see \ref{app:rw_multilayer}).
Figure 2: The SLATE model for link prediction with dynamic graph transformers (DGTs). To recover the lost spatio-temporal structure in DGTs, we adapt the supra-Laplacian matrix computation to DGTs by making the input graph provably connected (a), and use its spectral analysis to introduce a specific encoding for DGTs (b). We then apply a fully connected spatio-temporal transformer between all nodes at multiple time steps (c). Finally, we design in (d) an edge representation module dedicated to link prediction, using cross-attention on multiple temporal representations of the nodes.
Figure 3: Average percentage of isolated nodes per snapshot on real-world dynamic graph data.
Figure 4: Importance of the connectivity transformation steps used to connect the supra-adjacency matrix. AUC performance in dynamic link prediction.
Figure 5: An analysis of model efficiency comparing the memory usage (Mem.), training time per epoch (t/ep.), and the number of parameters (Nb params) on the Flights dataset.
...and 6 more figures
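
The edge representation module described in Figure 2 (d) can be illustrated with a small NumPy sketch of cross-attention between the temporal representations of two candidate nodes. This is a hedged illustration rather than the paper's implementation: the function names, the randomly initialised projection matrices `Wq`, `Wk`, `Wv`, the single attention head, the mean pooling over time, and the final concatenation are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_weights(queries, keys, Wq, Wk):
    """Scaled dot-product attention weights; each row sums to 1."""
    Q = queries @ Wq
    K = keys @ Wk
    return softmax(Q @ K.T / np.sqrt(K.shape[-1]), axis=-1)


def cross_attention(h_query, h_context, Wq, Wk, Wv):
    """Let each time step of one node attend to all time steps of the other node."""
    weights = attention_weights(h_query, h_context, Wq, Wk)
    return weights @ (h_context @ Wv)


def edge_representation(h_u, h_v, Wq, Wk, Wv):
    """Pool both attention directions over time and concatenate (assumed design)."""
    u_given_v = cross_attention(h_u, h_v, Wq, Wk, Wv).mean(axis=0)
    v_given_u = cross_attention(h_v, h_u, Wq, Wk, Wv).mean(axis=0)
    return np.concatenate([u_given_v, v_given_u])


# Hypothetical inputs: T time steps, d-dimensional transformer outputs per node.
rng = np.random.default_rng(0)
T, d = 3, 4
h_u = rng.normal(size=(T, d))   # temporal representations of node u
h_v = rng.normal(size=(T, d))   # temporal representations of node v
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

edge_vec = edge_representation(h_u, h_v, Wq, Wk, Wv)
weights_uv = attention_weights(h_u, h_v, Wq, Wk)
```

In a link prediction setting, a vector such as `edge_vec` would typically be passed to a small classifier that scores whether the edge (u, v) exists at the next time step; that classifier is omitted here.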
