A Comparative Study on Dynamic Graph Embedding based on Mamba and Transformers
Ashish Parmanand Pandey, Alan John Varghese, Sarang Patil, Mengjia Xu
TL;DR
This work compares transformer-based and state-space-based approaches to dynamic graph embedding, focusing on scalable temporal link prediction. It introduces three probabilistic models (ST-TransformerG2G, $\mathcal{DG}$-Mamba, and $\mathcal{GDG}$-Mamba) that map each node to a Gaussian embedding $\mathcal{N}(\mu_i^t, \Sigma_i^t)$ using a history of $l$ timestamps; $\mathcal{GDG}$-Mamba additionally applies edge-aware spatial processing via GINE convolutions. Empirical results across five benchmarks show that $\mathcal{DG}$-Mamba and $\mathcal{GDG}$-Mamba match or exceed the accuracy of ST-TransformerG2G while offering substantial computational efficiency gains, particularly on long sequences and highly dynamic graphs; Slashdot remains a challenging case where ST-TransformerG2G excels. Analysis of the learned state-transition matrix $A$ in Mamba reveals attention-like temporal focusing, underscoring the models' ability to capture complex temporal patterns with linear-time complexity. Overall, the work demonstrates that selective state-space models can scale dynamic graph representation learning to larger, real-world networks, with strong implications for social, financial, and biological domains.
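To make the Gaussian node embeddings concrete, the following is a minimal sketch of how two such embeddings might be compared for link prediction. The summary above does not state the scoring function, so the use of KL divergence (the dissimilarity used in Graph2Gauss-style models) is an assumption here, as are the diagonal covariances and the hypothetical function name `kl_diag_gaussians`.

```python
import math

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(N_p || N_q) for two Gaussians with diagonal covariances.

    Assumption: each node embedding is a mean vector plus a vector of
    positive per-dimension variances (the diagonal of Sigma).
    """
    if not (len(mu_p) == len(var_p) == len(mu_q) == len(var_q)):
        raise ValueError("all inputs must have the same dimension")
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension term of the closed-form Gaussian KL divergence.
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total
```

Under this assumption, a smaller divergence between the embeddings of nodes $i$ and $j$ at time $t$ would indicate a more likely link. The divergence is asymmetric, so `kl_diag_gaussians(p, q)` and `kl_diag_gaussians(q, p)` generally differ.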
Abstract
Dynamic graph embedding has emerged as an important technique for modeling complex time-evolving networks across diverse domains. While transformer-based models have shown promise in capturing long-range dependencies in temporal graph data, they face scalability challenges due to their quadratic computational complexity. This study presents a comparative analysis of dynamic graph embedding approaches using transformers and the recently proposed Mamba architecture, a state-space model with linear complexity. We introduce three novel models: TransformerG2G augmented with graph convolutional networks, $\mathcal{DG}$-Mamba, and $\mathcal{GDG}$-Mamba with graph isomorphism network edge convolutions. Our experiments on multiple benchmark datasets demonstrate that Mamba-based models achieve comparable or superior performance to transformer-based approaches in link prediction tasks while offering significant computational efficiency gains on longer sequences. Notably, $\mathcal{DG}$-Mamba variants consistently outperform transformer-based models on datasets with high temporal variability, such as UCI, Bitcoin, and Reality Mining, while maintaining competitive performance on more stable graphs like SBM. We provide insights into the learned temporal dependencies through analysis of attention weights and state matrices, revealing the models' ability to capture complex temporal patterns. By effectively combining state-space models with graph neural networks, our work addresses key limitations of previous approaches and contributes to the growing body of research on efficient temporal graph representation learning. These findings offer promising directions for scaling dynamic graph embedding to larger, more complex real-world networks, potentially enabling new applications in areas such as social network analysis, financial modeling, and biological system dynamics.
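The complexity contrast drawn in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: it uses a scalar state-space recurrence with fixed, hypothetical parameters `a`, `b`, and `c`, whereas Mamba's selective mechanism makes such parameters depend on the input. It only shows why a recurrent scan costs one update per timestamp, while full self-attention scores every pair of timestamps.

```python
def ssm_scan(xs, a, b, c):
    """Scalar linear state-space recurrence.

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t, starting from h_0 = 0.
    Assumption: fixed scalar parameters, chosen purely for illustration.
    """
    h = 0.0
    ys = []
    for x in xs:
        # One constant-cost state update per timestamp, so O(l) overall.
        h = a * h + b * x
        ys.append(c * h)
    return ys

def attention_pair_count(l):
    """Number of query-key scores full self-attention computes for l timestamps."""
    return l * l
```

For a history of $l$ timestamps, `ssm_scan` performs $l$ updates, while `attention_pair_count(l)` grows as $l^2$; doubling the history doubles the former and quadruples the latter.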
