GMoPE: A Prompt-Expert Mixture Framework for Graph Foundation Models

Zhibin Wang, Zhixing Zhang, Shuqi Wang, Xuanting Xie, Zhao Kang

TL;DR

This work tackles cross-domain generalization in graph foundation models by combining graph prompting with a Mixture-of-Experts (MoE) architecture. Each expert receives its own prompt, and a structure-aware router selects experts and aggregates their outputs; a soft orthogonality constraint on the prompts keeps the experts diverse. Transfer learning is lightweight, because only prompts and task heads are fine-tuned while the core experts remain fixed. Empirical results show that GMoPE outperforms state-of-the-art prompt-based and MoE baselines on link prediction, graph classification, and node classification, often matching or surpassing full-parameter fine-tuning at substantially lower adaptation cost. The approach offers a principled, scalable path toward robust, generalizable graph foundation models for diverse domains.
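The routing idea summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: prompts are added to node features (in the spirit of feature-prompting methods such as GPF), the "structure-aware" signal is a plain average over each node's neighborhood, experts are ordinary per-node functions standing in for pretrained GNN encoders, and the names `neighborhood_mean`, `route_top_k`, and `prompt_expert_mixture` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def neighborhood_mean(x, adj):
    """Average each node's features with those of its neighbors.
    Assumption: a stand-in for whatever structural signal the real router uses."""
    pooled = []
    for i, nbrs in enumerate(adj):
        group = [x[i]] + [x[j] for j in nbrs]
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return pooled

def route_top_k(h, router_w, k):
    """Score every expert linearly from h, keep the k best, renormalize their gates."""
    scores = [sum(w * v for w, v in zip(row, h)) for row in router_w]
    top = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)[:k]
    gates = softmax([scores[e] for e in top])
    return dict(zip(top, gates))

def prompt_expert_mixture(x, adj, prompts, experts, router_w, k):
    """Gate-weighted sum of expert outputs; each expert sees features shifted by its own prompt."""
    pooled = neighborhood_mean(x, adj)
    outputs = []
    for feats, h in zip(x, pooled):
        parts = []
        for e, gate in route_top_k(h, router_w, k).items():
            prompted = [a + p for a, p in zip(feats, prompts[e])]
            parts.append((gate, experts[e](prompted)))
        width = len(parts[0][1])
        outputs.append([sum(g * y[d] for g, y in parts) for d in range(width)])
    return outputs
```

Here `x` is a list of node feature vectors, `adj` an adjacency list, `prompts` one vector per expert, and `router_w` one weight row per expert. Renormalizing the gates over only the selected experts is one common top-k convention; the paper may weight experts differently.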

Abstract

Graph Neural Networks (GNNs) have demonstrated impressive performance on task-specific benchmarks, yet their ability to generalize across diverse domains and tasks remains limited. Existing approaches often struggle with negative transfer, scalability issues, and high adaptation costs. To address these challenges, we propose GMoPE (Graph Mixture of Prompt-Experts), a novel framework that integrates the Mixture-of-Experts (MoE) architecture with prompt-based learning for graphs. GMoPE leverages expert-specific prompt vectors and structure-aware MoE routing to enable each expert to specialize in distinct subdomains and dynamically contribute to predictions. To promote diversity and prevent expert collapse, we introduce a soft orthogonality constraint across prompt vectors, encouraging expert specialization and more balanced expert utilization. Additionally, we adopt a prompt-only fine-tuning strategy that significantly reduces time and space complexity during transfer. We validate GMoPE through extensive experiments under various pretraining strategies and multiple downstream tasks. Results show that GMoPE consistently outperforms state-of-the-art baselines and achieves performance comparable to full-parameter fine-tuning while requiring only a fraction of the adaptation overhead. Our work provides a principled and scalable framework for advancing generalizable and efficient graph foundation models.
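To make the prompt-only fine-tuning strategy concrete, the following toy sketch applies a gradient step to the prompt and task-head parameters only and leaves the expert weights untouched. The parameter names, sizes, gradients, and the helpers `sgd_step` and `trainable_fraction` are all hypothetical; they illustrate the general freezing pattern, not the authors' training code.

```python
def trainable_fraction(params, trainable):
    """Share of scalar parameters that would be updated during transfer."""
    total = sum(len(values) for values in params.values())
    tuned = sum(len(values) for name, values in params.items() if name in trainable)
    return tuned / total

def sgd_step(params, grads, trainable, lr=0.1):
    """One plain SGD step that touches only the groups named in `trainable`."""
    updated = {}
    for name, values in params.items():
        if name in trainable:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen: copied through unchanged
    return updated

# Toy parameter groups; the sizes are illustrative only.
params = {
    "expert_0": [0.5] * 8,
    "expert_1": [-0.5] * 8,
    "prompt_0": [0.0, 0.0],
    "prompt_1": [0.0, 0.0],
    "task_head": [1.0, 1.0],
}
grads = {name: [1.0] * len(values) for name, values in params.items()}
trainable = {"prompt_0", "prompt_1", "task_head"}

after = sgd_step(params, grads, trainable)
```

In this toy setup only 6 of 22 scalars are trainable. In a real model the experts would typically dominate the parameter count far more heavily, which is where the reduced adaptation overhead would come from.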

Paper Structure

This paper contains 47 sections, 1 theorem, 26 equations, 6 figures, 6 tables.

Key Result

Theorem 1

Let $\mathcal{F}_{\mathrm{GPF}}$ and $\mathcal{F}_{\mathrm{GMoPE}}$ denote the function classes induced by GPF and GMoPE, respectively. Then the following conclusion holds:

Figures (6)

  • Figure 1: Guide experts with prompts
  • Figure 2: Overview of the proposed GMoPE framework. (a) Pre-training phase: All model parameters are optimized jointly, guided by structure-aware MoE routing and a soft orthogonality loss to encourage expert diversity. (b) Transfer learning phase: Expert parameters are frozen, and only task-specific prompts are fine-tuned, enabling efficient and modular adaptation to new domains. (c) Inference phase: Expert outputs are aggregated based on learned routing to produce the final graph representations for downstream tasks.
  • Figure 3: The performance impact of soft orthogonal loss on different downstream tasks.
  • Figure 4: Node classification performance under different values of M and K (on citation networks, using the DGI pre-training strategy)
  • Figure 5: The relationship between soft orthogonal loss and soft router weight allocation
  • ...and 1 more figure
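Several captions above refer to a soft orthogonal loss on the prompt vectors. The exact formulation is not given in this summary, so the sketch below uses one common choice as an assumption: the mean squared cosine similarity over all distinct pairs of prompts, which is 0 when the prompts are mutually orthogonal and 1 when they are all parallel. The function name `soft_orthogonality_penalty` is hypothetical.

```python
import math

def soft_orthogonality_penalty(prompts):
    """Mean squared cosine similarity over all distinct pairs of prompt vectors.
    Assumes at least two prompts and no all-zero prompt."""
    unit = []
    for p in prompts:
        norm = math.sqrt(sum(v * v for v in p))
        unit.append([v / norm for v in p])
    total, pairs = 0.0, 0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cos = sum(a * b for a, b in zip(unit[i], unit[j]))
            total += cos * cos
            pairs += 1
    return total / pairs
```

A penalty like this would be added to the training objective with a weighting coefficient, so that overlap between prompts is discouraged rather than strictly forbidden, which is what makes the constraint "soft".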

Theorems & Definitions (1)

  • Theorem 1