MiNT: Multi-Network Training for Transfer Learning on Temporal Graphs
Kiarash Shamsi, Tran Gia Bao Ngo, Razieh Shirzadkhani, Shenyang Huang, Farimah Poursafaei, Poupak Azad, Reihaneh Rabbany, Baris Coskunuzer, Guillaume Rabusseau, Cuneyt Gurcan Akcora
TL;DR
MiNT tackles transfer learning on temporal graphs by pre-training temporal graph neural networks (TGNNs) across multiple networks and enabling zero-shot inference on unseen networks. The framework introduces MiNT-train, which uses order shuffling and context switching, and is evaluated on a diverse collection of 84 Ethereum-based temporal networks (64 for training, 20 unseen). Increasing the number of pre-training networks yields stronger transfer performance, with competitive or state-of-the-art results on numerous unseen networks. The authors also provide a thorough ablation and data-selection analysis, which reveals the importance of memory resets and data randomness for generalization, and they report scalable training behavior on commercial-grade GPUs. By releasing the MiNT datasets and code, this work lays the groundwork for Temporal Graph Foundation Models and broad applicability to dynamic network tasks such as financial forecasting and transaction analysis.
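The order shuffling and context switching mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that order shuffling means randomizing the order in which networks are visited in each epoch, and that context switching amounts to resetting the model's memory whenever training moves to a different network. The names `multi_network_train`, `train_step`, and `reset_memory` are hypothetical placeholders.

```python
import random
from typing import Callable, Dict, List, Sequence


def multi_network_train(
    networks: Dict[str, Sequence],
    train_step: Callable[[str, object], None],
    reset_memory: Callable[[], None],
    epochs: int = 2,
    seed: int = 0,
) -> List[List[str]]:
    """Toy multi-network training loop (illustrative assumption only).

    networks maps a network name to its chronologically ordered snapshots.
    Returns the network visiting order used in each epoch.
    """
    rng = random.Random(seed)
    visit_orders: List[List[str]] = []
    for _ in range(epochs):
        # Order shuffling (assumed): visit networks in a new random order.
        order = list(networks)
        rng.shuffle(order)
        visit_orders.append(order)
        for name in order:
            # Context switching (assumed): clear memory before a new network
            # so state from the previous network does not leak into this one.
            reset_memory()
            # Snapshots within a network stay in chronological order.
            for snapshot in networks[name]:
                train_step(name, snapshot)
    return visit_orders
```

In a real setting, `train_step` would run one optimization step of a TGNN on a snapshot and `reset_memory` would clear whatever node memory or neighbor state the model keeps. Here they are left as callbacks so the control flow can be read on its own.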
Abstract
Temporal Graph Learning (TGL) has become a robust framework for discovering patterns in dynamic networks and predicting future interactions. While existing research has largely concentrated on learning from individual networks, this study explores the potential of learning from multiple temporal networks and its ability to transfer to unobserved networks. To achieve this, we introduce Temporal Multi-network Training (MiNT), a novel pre-training approach that learns from multiple temporal networks. With a novel collection of 84 temporal transaction networks, we pre-train TGL models on up to 64 networks and assess their transferability to 20 unseen networks. Remarkably, MiNT achieves state-of-the-art results in zero-shot inference, surpassing models individually trained on each network. Our findings further demonstrate that increasing the number of pre-training networks significantly improves transfer performance. This work lays the groundwork for developing Temporal Graph Foundation Models, highlighting the significant potential of multi-network pre-training in TGL.
