GMoPE: A Prompt-Expert Mixture Framework for Graph Foundation Models
Zhibin Wang, Zhixing Zhang, Shuqi Wang, Xuanting Xie, Zhao Kang
TL;DR
This work tackles the challenge of cross-domain generalization in graph foundation models by fusing graph prompting with a Mixture-of-Experts (MoE) architecture. Each expert receives a unique prompt, and a structure-aware router selects and aggregates expert outputs, augmented by a soft orthogonality constraint to sustain diversity. Crucially, transfer learning is lightweight, as only prompts and task heads are fine-tuned while core experts remain fixed. Empirical results show GMoPE outperforms state-of-the-art prompt-based and MoE baselines across link prediction and graph/node classification tasks, often matching or surpassing full-parameter fine-tuning with substantially lower adaptation overhead. The approach offers a principled, scalable pathway toward robust, generalizable graph foundation models suitable for diverse domains.
Abstract
Graph Neural Networks (GNNs) have demonstrated impressive performance on task-specific benchmarks, yet their ability to generalize across diverse domains and tasks remains limited. Existing approaches often struggle with negative transfer, scalability issues, and high adaptation costs. To address these challenges, we propose GMoPE (Graph Mixture of Prompt-Experts), a novel framework that seamlessly integrates the Mixture-of-Experts (MoE) architecture with prompt-based learning for graphs. GMoPE leverages expert-specific prompt vectors and structure-aware MoE routing to enable each expert to specialize in distinct subdomains and dynamically contribute to predictions. To promote diversity and prevent expert collapse, we introduce a soft orthogonality constraint across prompt vectors, encouraging expert specialization and facilitating a more balanced expert utilization. Additionally, we adopt a prompt-only fine-tuning strategy that significantly reduces spatiotemporal complexity during transfer. We validate GMoPE through extensive experiments under various pretraining strategies and multiple downstream tasks. Results show that GMoPE consistently outperforms state-of-the-art baselines and achieves performance comparable to full-parameter fine-tuning while requiring only a fraction of the adaptation overhead. Our work provides a principled and scalable framework for advancing generalizable and efficient graph foundation models.
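As a rough illustration of two ideas named in the abstract, the sketch below shows one possible soft orthogonality penalty over prompt vectors and one possible way to mix expert outputs with router weights. The abstract does not give the exact loss or router, so everything here is an assumption: the penalty is taken to be the sum of squared cosine similarities between distinct prompts, the router is a plain softmax over scores supplied by the caller (the structure-aware scoring itself is not modeled), and the function names `soft_orthogonality_penalty`, `route`, and `mix_expert_outputs` are hypothetical.

```python
import math


def soft_orthogonality_penalty(prompts):
    """Assumed penalty: sum of squared cosine similarities over distinct prompt pairs.

    The value is 0 when all prompts are mutually orthogonal and grows as
    prompts align, which is one way to discourage experts from collapsing.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    unit = [normalize(p) for p in prompts]
    penalty = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cos = sum(a * b for a, b in zip(unit[i], unit[j]))
            penalty += cos * cos
    return penalty


def route(scores):
    """Assumed router: numerically stable softmax over per-expert scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_expert_outputs(expert_outputs, weights):
    """Weighted sum of expert output vectors using the router weights."""
    dim = len(expert_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, expert_outputs))
        for d in range(dim)
    ]
```

In a training loop, a penalty of this kind would typically be added to the task loss with a small coefficient. Under prompt-only fine-tuning as the abstract describes it, only the prompt vectors and task heads would receive gradient updates while the expert weights stay frozen.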
