A Comparative Study on Dynamic Graph Embedding based on Mamba and Transformers

Ashish Parmanand Pandey, Alan John Varghese, Sarang Patil, Mengjia Xu

TL;DR

This work compares transformer-based and state-space-based approaches for dynamic graph embedding, focusing on scalable temporal link prediction. It introduces three probabilistic models—ST-TransformerG2G, $\mathcal{DG}$-Mamba, and $\mathcal{GDG}$-Mamba—that map nodes to Gaussian embeddings $\mathcal{N}(\mu_i^t, \Sigma_i^t)$ using a history of $l$ timestamps and, in the case of GDG-Mamba, edge-aware spatial processing via GINE convolutions. Empirical results across five benchmarks show that DG-Mamba and GDG-Mamba achieve comparable or superior accuracy to ST-TransformerG2G while offering substantial computational efficiency, particularly on long sequences and highly dynamic graphs; Slashdot remains a challenging case where ST-TransformerG2G excels. The analysis of the learned state-transition matrix $A$ in Mamba reveals attention-like temporal focusing, underscoring the models’ ability to capture complex temporal patterns with linear-time complexity. Overall, the work demonstrates that selective state-space models can effectively scale dynamic graph representation learning to larger, real-world networks, with strong implications for social, financial, and biological domains.

Abstract

Dynamic graph embedding has emerged as an important technique for modeling complex time-evolving networks across diverse domains. While transformer-based models have shown promise in capturing long-range dependencies in temporal graph data, they face scalability challenges due to quadratic computational complexity. This study presents a comparative analysis of dynamic graph embedding approaches using transformers and the recently proposed Mamba architecture, a state-space model with linear complexity. We introduce three novel models: TransformerG2G augmented with graph convolutional networks, $\mathcal{DG}$-Mamba, and $\mathcal{GDG}$-Mamba with graph isomorphism network edge convolutions. Our experiments on multiple benchmark datasets demonstrate that Mamba-based models achieve comparable or superior performance to transformer-based approaches in link prediction tasks while offering significant computational efficiency gains on longer sequences. Notably, $\mathcal{DG}$-Mamba variants consistently outperform transformer-based models on datasets with high temporal variability, such as UCI, Bitcoin, and Reality Mining, while maintaining competitive performance on more stable graphs like SBM. We provide insights into the learned temporal dependencies through analysis of attention weights and state matrices, revealing the models' ability to capture complex temporal patterns. By effectively combining state-space models with graph neural networks, our work addresses key limitations of previous approaches and contributes to the growing body of research on efficient temporal graph representation learning. These findings offer promising directions for scaling dynamic graph embedding to larger, more complex real-world networks, potentially enabling new applications in areas such as social network analysis, financial modeling, and biological system dynamics.
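
The complexity contrast drawn in the abstract can be illustrated with a toy example. The sketch below is not the Mamba implementation used in the paper: it is a scalar, single-channel recurrence with an input-dependent step size, and the names `selective_scan`, `w_delta`, and `a` are hypothetical. It only shows why a state-space scan touches each of the $l$ timestamps once, whereas full self-attention scores every pair of timestamps.

```python
import math


def selective_scan(xs, w_delta=1.0, a=-1.0):
    """Toy scalar selective state-space recurrence (illustrative only).

    h_t = exp(delta_t * a) * h_{t-1} + delta_t * x_t
    where delta_t = softplus(w_delta * x_t) depends on the input.
    w_delta and a are hypothetical parameters, not values from the paper.
    """
    h = 0.0
    states = []
    for x in xs:  # one pass over the sequence: O(l) work
        delta = math.log1p(math.exp(w_delta * x))  # softplus keeps the step positive
        decay = math.exp(delta * a)  # lies in (0, 1) because a < 0
        h = decay * h + delta * x
        states.append(h)
    return states


def pairwise_score_count(l):
    """Number of query-key scores full self-attention computes for length l."""
    return l * l


history = [0.5, -0.2, 1.0, 0.3, -0.7]  # made-up inputs for l = 5 timestamps
print(len(selective_scan(history)), "recurrence steps")
print(pairwise_score_count(len(history)), "attention scores")
```

Because the step size depends on the current input, the recurrence can retain or discount earlier timestamps selectively, which is the property the paper relates to attention-like temporal focusing.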

Paper Structure

This paper contains 22 sections, 18 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: The proposed ST-TransformerG2G architecture extends the previous TransformerG2G model (varghese2024transformerg2g) with GCNs. An additional GCN block, consisting of three GCN layers, is added to TransformerG2G to explicitly capture the spatial interactions within each graph snapshot. The resulting node embeddings are fed into a vanilla transformer encoder module, with positional encoding added to each input node token embedding. The final output representation of each node is a multivariate normal distribution $\mathcal{N}(\mu_i^t, \Sigma_i^t)$, where $\Sigma_i^t = \mathrm{diag}(\sigma_i^t)$.
  • Figure 2: The main architecture of the proposed $\mathcal{DG}$-Mamba model. It processes a sequence of discrete-time graph snapshots $\{G_t\}_{t=1}^T$, where a look-back parameter $l \in \{1, 2, 3, 4, 5\}$ allows for historical context integration. Each graph snapshot first goes through projection and convolution to capture localized node features while maintaining spatial relationships. An SSM layer then captures long-range temporal dependencies efficiently using the selective scan mechanism (Mamba architecture). The output of this Mamba layer is passed through an activation function for non-linearity, followed by mean pooling to generate an aggregate representation. A linear projection layer with tanh activation refines the node embeddings. Two additional projection heads (one linear projection layer and one nonlinear projection with an ELU activation function) produce the mean and variance of the Gaussian embeddings.
  • Figure 3: The main architecture of the $\mathcal{GDG}$-Mamba model. It first processes a series of discrete-time graph snapshots $\{G_t\}_{t=1}^T$ using the GINE convolution to enhance the spatial representation of the graph by considering both node and edge-level features at each timestamp. The generated graph sequence representations are then processed through the Mamba block to capture temporal dynamics, followed by mean pooling and a linear layer with tanh nonlinearity before outputting the final Gaussian embeddings. The model incorporates node and edge-level features across both spatial and temporal dimensions.
  • Figure 4: Comparison of MAP (1st column) and MRR (2nd column) on the temporal link prediction task for the DynG2G, TransformerG2G, ST-TransformerG2G, $\mathcal{DG}$-Mamba, and $\mathcal{GDG}$-Mamba models.
  • Figure 5: Comparison of $\mathcal{DG}$-Mamba state transition matrices (Columns 1, 3, 5) and TransformerG2G attention matrices (Columns 2, 4, 6) across 24 randomly selected time steps for Reality Mining.
  • ...and 2 more figures
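
The two projection heads described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy weights are hypothetical, and adding 1 to the ELU output (so the diagonal entries $\sigma_i^t$ stay positive) is an assumption that the caption does not state.

```python
import math


def elu(z, alpha=1.0):
    """Exponential linear unit; its output is greater than -alpha."""
    return z if z > 0 else alpha * (math.exp(z) - 1.0)


def linear(x, W, b):
    """Dense layer y = W x + b on plain Python lists."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def gaussian_heads(h, W_mu, b_mu, W_sigma, b_sigma):
    """Map a refined node embedding h to (mu, sigma).

    mu comes from a linear head. sigma comes from a linear map followed by
    ELU, shifted by 1 (an assumption) so that diag(sigma) is a valid
    covariance for moderate inputs.
    """
    mu = linear(h, W_mu, b_mu)
    sigma = [elu(z) + 1.0 for z in linear(h, W_sigma, b_sigma)]
    return mu, sigma


# Toy 2-dimensional example with identity weights (hypothetical values).
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
mu, sigma = gaussian_heads([0.5, -1.0], identity, zeros, identity, zeros)
print(mu, sigma)
```

Keeping the covariance diagonal means each node needs only two vectors of the embedding dimension, one for the mean and one for the diagonal of $\Sigma_i^t$.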