Dynamic Graph Transformer with Correlated Spatial-Temporal Positional Encoding

Zhe Wang, Sheng Zhou, Jiawei Chen, Zhen Zhang, Binbin Hu, Yan Feng, Chun Chen, Can Wang

TL;DR

This work tackles representation learning on Continuous-Time Dynamic Graphs (CTDGs) by modeling the evolving spatial-temporal proximity between nodes. It introduces CorDGT, a Transformer-based framework that samples contextual nodes and encodes their proximity to the target node pair through a Correlated Spatial-Temporal Positional Encoding (STPE-C), built on a parameter-free Temporal Distance derived from a Poisson point process. The encoding unifies spatial distance, temporal distance, and high-order proximity in a learnable form, so attention-based aggregation needs no costly intensity pretraining. Experiments on nine CTDG datasets, including large Gowalla subsets, show superior predictive performance and favorable scalability, and ablations confirm that STPE-C and the Poisson-based distance components matter for capturing evolving proximity. The approach improves accuracy and efficiency on downstream link prediction and node classification tasks.

Abstract

Learning effective representations for Continuous-Time Dynamic Graphs (CTDGs) has garnered significant research interest, largely due to its powerful capabilities in modeling complex interactions between nodes. A fundamental requirement for representation learning in CTDGs is the appropriate estimation and preservation of proximity. However, due to the sparse and evolving characteristics of CTDGs, the spatial-temporal properties inherent in high-order proximity remain largely unexplored. Despite its importance, this property is challenging to capture because personalized interaction intensity estimation is computationally intensive and CTDGs change over time. To this end, we propose a novel Correlated Spatial-Temporal Positional Encoding that incorporates a parameter-free personalized interaction intensity estimation under the weak assumption of the Poisson point process. Building on this, we introduce the Dynamic Graph Transformer with Correlated Spatial-Temporal Positional Encoding (CorDGT), which efficiently retains the evolving spatial-temporal high-order proximity for effective node representation learning in CTDGs. Extensive experiments on seven small-scale and two large-scale datasets demonstrate the superior performance and scalability of the proposed CorDGT. The code is available at: https://github.com/wangz3066/CorDGT.

Paper Structure (51 sections, 1 theorem, 17 equations, 6 figures, 6 tables, 1 algorithm)


Key Result

Lemma 1

Suppose the interactions between $w$ and $w_0$ prior to $t_{pred}$, denoted as $T(w,w_0,t_{pred}) = \{t_1, \ldots, t_n\}$ with $t_{i-1} < t_i$ $(i = 2, \ldots, n)$ and $t_n < t_{pred}$, follow a Poisson point process with intensity $\lambda$. Then the maximum likelihood estimate of $\lambda$ is $\dfrac{n}{
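
The kind of estimate the lemma concerns can be illustrated with a minimal Python sketch of the textbook result for a homogeneous Poisson process: with n events observed in a window of length T, the likelihood is proportional to lambda^n * exp(-lambda * T), which is maximized at n / T. The function name `poisson_intensity_mle` and the window start `t_start` are assumptions of this sketch and are not taken from the lemma or from the CorDGT code.

```python
from typing import Sequence


def poisson_intensity_mle(
    event_times: Sequence[float], t_start: float, t_pred: float
) -> float:
    """Rate estimate for a homogeneous Poisson process observed on [t_start, t_pred).

    ASSUMPTION: `t_start` (the start of the observation window) is a
    hypothetical parameter chosen for illustration only.
    """
    window = t_pred - t_start
    if window <= 0:
        raise ValueError("observation window must have positive length")
    # Count only the interactions that fall inside the observation window.
    n = sum(1 for t in event_times if t_start <= t < t_pred)
    # Maximizing n * log(lam) - lam * window over lam gives lam = n / window.
    return n / window


if __name__ == "__main__":
    # Three interactions observed over a window of length 10.
    print(poisson_intensity_mle([1.0, 4.0, 9.0], t_start=0.0, t_pred=10.0))
```

Because the estimate is a closed-form ratio of a count to a duration, it needs no trainable parameters, which is consistent with the "parameter-free" description in the abstract.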

Figures (6)

  • Figure 1: A social network example. The model is expected to predict the existence of the interaction between node $u$ and $v$ at $t=11$.
  • Figure 2: The framework of the proposed CorDGT.
  • Figure 3: The inductive AP metrics and the training time per epoch on Food datasets. The closer a point is to the upper left corner, the better the performance. CorDGT-10 and CorDGT-20 denote CorDGT with 10 and 20 contextual nodes, respectively.
  • Figure 4: Heatmap values indicate the confidence score on positive source/target node pairs predicted by different groups of contextual nodes. Left to right: UCI and Enron datasets. Top to bottom: the contextual nodes are grouped according to their Spatial Distance and Temporal Distance to source/target nodes. The blank cells indicate that no data is allocated to this group. Best viewed in color.
  • Figure 5: The inductive AP metrics and the training time per epoch on Outdoors datasets. The closer a point is to the upper left corner, the better the performance. CorDGT-10 and CorDGT-20 denote CorDGT with 10 and 20 contextual nodes, respectively.
  • ...and 1 more figure

Theorems & Definitions (2)

  • Lemma 1
  • proof