MiNT: Multi-Network Training for Transfer Learning on Temporal Graphs

Kiarash Shamsi; Tran Gia Bao Ngo; Razieh Shirzadkhani; Shenyang Huang; Farimah Poursafaei; Poupak Azad; Reihaneh Rabbany; Baris Coskunuzer; Guillaume Rabusseau; Cuneyt Gurcan Akcora

TL;DR

MiNT tackles transfer learning in temporal graphs by pre-training TGNNs across multiple networks and enabling zero-shot inference on unseen networks. The framework introduces MiNT-train with order shuffling and context switching, trains on a diverse set of 84 Ethereum-based temporal networks (64 for training, 20 unseen), and demonstrates that increasing the number of pre-training networks yields stronger transfer performance, achieving competitive or state-of-the-art results on numerous unseen cases. The authors also provide a thorough ablation and data-selection analysis, revealing the importance of memory resets and data randomness for generalization, and report scalable training behavior on commercial-grade GPUs. By releasing the MiNT datasets and code, this work lays the groundwork for Temporal Graph Foundation Models and broad applicability to dynamic network tasks such as financial forecasting and transaction analysis.
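The training scheme named above (order shuffling plus context switching) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: `ToyTemporalModel`, `multi_network_train`, and the snapshot representation are hypothetical, and "context switching" is read here as resetting the network-specific memory whenever training moves to a different network, which the summary suggests matters for generalization.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: a snapshot is a list of (src, dst) edges.
Snapshot = List[Tuple[str, str]]


class ToyTemporalModel:
    """Stand-in for a TGNN: shared weights plus network-specific memory."""

    def __init__(self) -> None:
        self.weight = 0.0   # parameters shared across all networks
        self.memory = 0.0   # temporal state tied to the current network
        self.resets = 0     # counts context switches, for inspection

    def reset_memory(self) -> None:
        self.memory = 0.0
        self.resets += 1

    def train_step(self, snapshot: Snapshot) -> None:
        # Placeholder update; a real model would run a forward and
        # backward pass on the snapshot here.
        self.memory += len(snapshot)
        self.weight += 0.01 * self.memory


def multi_network_train(
    model: ToyTemporalModel,
    networks: Dict[str, Sequence[Snapshot]],
    epochs: int,
    seed: int = 0,
) -> List[str]:
    """Train one model over many networks and return the visit order."""
    rng = random.Random(seed)
    names = list(networks)
    visit_log: List[str] = []
    for _ in range(epochs):
        rng.shuffle(names)            # order shuffling (assumed: per epoch)
        for name in names:
            model.reset_memory()      # context switching (assumed: memory reset)
            for snapshot in networks[name]:   # chronological within a network
                model.train_step(snapshot)
            visit_log.append(name)
    return visit_log
```

The shared `weight` persists across networks while `memory` does not, which is the property that would let a single set of parameters be applied to a network never seen during training.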

Abstract

Temporal Graph Learning (TGL) has become a robust framework for discovering patterns in dynamic networks and predicting future interactions. While existing research has largely concentrated on learning from individual networks, this study explores the potential of learning from multiple temporal networks and its ability to transfer to unobserved networks. To achieve this, we introduce Temporal Multi-network Training (MiNT), a novel pre-training approach that learns from multiple temporal networks. With a novel collection of 84 temporal transaction networks, we pre-train TGL models on up to 64 networks and assess their transferability to 20 unseen networks. Remarkably, MiNT achieves state-of-the-art results in zero-shot inference, surpassing models individually trained on each network. Our findings further demonstrate that increasing the number of pre-training networks significantly improves transfer performance. This work lays the groundwork for developing Temporal Graph Foundation Models, highlighting the significant potential of multi-network pre-training in TGL.

Paper Structure (28 sections, 7 equations, 9 figures, 9 tables, 1 algorithm)

Introduction
Related Work
Temporal Graph Learning.
Graph Foundation Models.
Background
MiNT: Temporal Multi-network Training
Multi-network Training
MiNT Datasets
Experiments
Contenders And Baselines
Persistence Forecast.
Single Models.
MiNT Models.
Computational Resource.
Experimental Results
...and 13 more sections

Figures (9)

Figure 1: Scaling behavior of MiNT on unseen networks. Zero-shot inference performance of MiNT (Multi-Network Model) on unseen networks, compared with standard training of individual networks (Single Model). The base TGNN models are (a) HTGN, and (b) GC-LSTM. A single model is trained and tested on each test network, while MiNT performs zero-shot inference on each test network. The metric is the average ROC AUC over 20 test networks.
Figure 2: MiNT framework. Temporal graphs are preprocessed to generate discrete-time snapshots. Next, the multi-network training pipeline leverages these snapshots to train TGNNs across multiple networks for zero-shot inference on unseen temporal networks.
Figure 3: Network statistics of MiNT networks. (a) Novelty score, (b) number of days, (c) number of nodes, and (d) number of edges.
Figure 4: MiNT Performance with Varying Training Scales. Test AUC of MiNT models trained on 4, 16 and 64 networks and evaluated on unseen test datasets. We compare the performance with persistence forecast, and HTGN models trained and tested on each dataset.
Figure 5: Effect of Data Selection on model performance.
...and 4 more figures
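The captions for Figures 1 and 4 report ROC AUC averaged over the test networks and compare against a persistence forecast. A small sketch of how such a comparison could be computed follows. The function names are hypothetical, and the persistence forecast is assumed to mean "predict that the next label equals the most recently observed label"; the paper's exact definition is not given in this listing.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked
    correctly, counting ties as half. Quadratic, fine for small inputs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def persistence_forecast(labels: Sequence[int]) -> Tuple[List[int], List[float]]:
    """Assumed baseline: the score for step t is the label at step t - 1.
    Returns (targets, scores) aligned on steps 1..T-1."""
    targets = list(labels[1:])
    scores = [float(y) for y in labels[:-1]]
    return targets, scores


def mean_auc(per_network: Dict[str, Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Unweighted average of per-network AUC values."""
    aucs = [roc_auc(y, s) for y, s in per_network.values()]
    return sum(aucs) / len(aucs)
```

Averaging per-network AUC values without weighting treats every test network equally regardless of its length, which matches the "average over test networks" wording in the caption but is still an assumption about how the reported numbers were aggregated.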

Theorems & Definitions (1)

Definition 1: Discrete Time Dynamic Graphs
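To make the notion of a discrete-time dynamic graph concrete, the sketch below groups timestamped edges into fixed-width snapshots, in the spirit of the preprocessing step described in the Figure 2 caption. The function name `to_snapshots`, the edge tuple layout, and the fixed window width are assumptions for illustration; the paper's actual preprocessing choices are not specified in this listing.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical edge format: (source, destination, integer timestamp).
Edge = Tuple[str, str, int]


def to_snapshots(
    edges: Sequence[Edge], window: int
) -> List[List[Tuple[str, str]]]:
    """Split timestamped edges into consecutive snapshots of equal width.

    Snapshot i holds every edge whose timestamp falls in
    [start + i * window, start + (i + 1) * window), where start is the
    earliest timestamp. Empty windows are kept so that the snapshot
    index still corresponds to elapsed time.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not edges:
        return []
    start = min(t for _, _, t in edges)
    buckets: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for src, dst, t in edges:
        buckets[(t - start) // window].append((src, dst))
    last = max(buckets)
    # .get avoids creating entries for empty windows in the defaultdict
    return [buckets.get(i, []) for i in range(last + 1)]
```

For example, with a window of 10 time units, edges at times 0, 5, and 25 land in snapshots 0, 0, and 2, leaving snapshot 1 empty.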
