DTFormer: A Transformer-Based Method for Discrete-Time Dynamic Graph Representation Learning

Xi Chen, Yun Xiong, Siwei Zhang, Jiawei Zhang, Yao Zhang, Shiyang Zhou, Xixi Wu, Mingyang Zhang, Tengfei Liu, Weiqiang Wang

TL;DR

DTFormer replaces GNN+RNN with a Transformer-based approach for Discrete-Time Dynamic Graphs, utilizing first-hop neighbor sequences and a five-feature construction to capture both topological context and temporal evolution. A novel multi-patching module enables multi-granularity sequence processing, while a neighbor-intersection feature enhances the modeling of interactions between candidate nodes. Across six public benchmarks, DTFormer achieves state-of-the-art performance in MRR, AUC-ROC, and AP, with improved scalability and reduced memory usage compared to prior GNN+RNN methods. The work demonstrates the feasibility and benefits of Transformer-based DTDG representation learning and provides a flexible framework for future enhancements in dynamic graph analysis.

Abstract

Discrete-Time Dynamic Graphs (DTDGs), which are prevalent in real-world implementations and notable for their ease of data acquisition, have garnered considerable attention from both academic researchers and industry practitioners. The representation learning of DTDGs has been extensively applied to model the dynamics of temporally changing entities and their evolving connections. Currently, DTDG representation learning predominantly relies on GNN+RNN architectures, which manifest the inherent limitations of both Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs). GNNs suffer from the over-smoothing issue as the model architecture goes deeper, while RNNs struggle to capture long-term dependencies effectively. GNN+RNN architectures also grapple with scaling to large graph sizes and long sequences. Additionally, these methods often compute node representations separately and focus solely on individual node characteristics, thereby overlooking the behavior intersections between the two nodes whose link is being predicted, such as instances where the two nodes appear together in the same context or share common neighbors. This paper introduces a novel representation learning method for DTDGs, DTFormer, pivoting from the traditional GNN+RNN framework to a Transformer-based architecture. Our approach exploits the attention mechanism to concurrently process topological information within the graph at each timestamp and temporal dynamics of graphs along the timestamps, circumventing the aforementioned fundamental weaknesses of both GNNs and RNNs. Moreover, we enhance the model's expressive capability by incorporating the intersection relationships among nodes and integrating a multi-patching module. Extensive experiments conducted on six public dynamic graph benchmark datasets confirm our model's efficacy, achieving state-of-the-art (SOTA) performance.
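
The abstract mentions behavior intersections such as two nodes sharing common neighbors. One simple way to turn that idea into a per-position feature is sketched below. This is only an illustration under assumptions: the function name `intersect_features`, the input format (a flat list of first-hop neighbor IDs per node), and the count-pair encoding are hypothetical and are not taken from the paper's definition of the intersect feature.

```python
from collections import Counter


def intersect_features(neighbors_i, neighbors_j):
    """Hypothetical co-occurrence feature for a candidate link (i, j).

    For every entry in each node's neighbor sequence, emit a pair:
    (occurrences in i's sequence, occurrences in j's sequence).
    A non-zero value in both slots marks a common neighbor.
    """
    count_i = Counter(neighbors_i)
    count_j = Counter(neighbors_j)
    # Counter returns 0 for IDs it has never seen, so no key checks are needed.
    feats_i = [(count_i[n], count_j[n]) for n in neighbors_i]
    feats_j = [(count_i[n], count_j[n]) for n in neighbors_j]
    return feats_i, feats_j


# Toy example: node 3 is the only neighbor shared by i and j.
feats_i, feats_j = intersect_features([2, 3, 3, 5], [3, 7])
print(feats_i)
print(feats_j)
```

Because the feature is computed jointly from both sequences, the representation of node i depends on which node j it is paired with, which is the property that separately computed node embeddings lack.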

Paper Structure (19 sections, 11 equations, 4 figures, 3 tables)

This paper contains 19 sections, 11 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of DTFormer: First, we construct neighbor sequences for nodes $i$ and $j$ and format five distinct features, encoding them to create feature sequences. Next, we apply the multi-patching module and corresponding Transformer encoders to obtain the embeddings of nodes $i$ and $j$, which are then concatenated and used to predict the future link.
  • Figure 2: Formatting and Encoding Features: For nodes $i$ and $j$, we format node, edge, positional, occurrence, and intersect features and encode them to create feature sequences.
  • Figure 3: Multi-Patching: Applying multiple patch sizes to split feature sequences helps our model capture information at different granularities and reduce model complexity.
  • Figure 4: Experiment results for the empirical study of the multi-patching module, reporting the AP(%) and AUC-ROC(%).
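
The multi-patching idea in the Figure 3 caption (splitting one feature sequence with several patch sizes) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the helper names `patchify` and `multi_patch`, the zero-padding at the end of the sequence, and the flattening of each patch into one vector are all assumptions made for the example.

```python
def patchify(sequence, patch_size):
    """Split a non-empty list of feature vectors into non-overlapping patches.

    Each patch of `patch_size` vectors is flattened into one longer vector,
    so a sequence of length L and width d becomes about L / patch_size
    tokens of width patch_size * d.
    """
    dim = len(sequence[0])
    padded = list(sequence)
    remainder = len(padded) % patch_size
    if remainder:
        # Assumption: zero-pad the tail so the length divides evenly.
        padded += [[0.0] * dim for _ in range(patch_size - remainder)]
    patches = []
    for start in range(0, len(padded), patch_size):
        chunk = padded[start:start + patch_size]
        patches.append([x for vec in chunk for x in vec])
    return patches


def multi_patch(sequence, patch_sizes):
    """Build one patched view of the sequence per patch size."""
    return {p: patchify(sequence, p) for p in patch_sizes}


# Toy feature sequence: 6 positions, 2 features per position.
seq = [[float(t), float(t) + 0.5] for t in range(6)]
views = multi_patch(seq, patch_sizes=(1, 2, 4))
shapes = {p: (len(v), len(v[0])) for p, v in views.items()}
print(shapes)
```

Larger patch sizes yield fewer, wider tokens. Since self-attention cost grows with the square of the number of tokens, the coarser views are cheaper to encode, which is consistent with the caption's point about reducing model complexity while still covering several granularities.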