AnyGraph: Graph Foundation Model in the Wild

Lianghao Xia, Chao Huang

TL;DR

AnyGraph tackles the challenge of cross-domain generalization in graph data by introducing a graph foundation model built on a Mixture-of-Experts (MoE) framework with a lightweight routing mechanism. By training a diverse set of graph experts and dynamically routing inputs to the most competent expert, AnyGraph effectively handles both structure and feature heterogeneity while enabling fast adaptation and emergent scaling behavior. Extensive evaluation across 38 datasets demonstrates strong zero-shot performance, scalable improvements with model size and data, and efficiency advantages over traditional fine-tuning approaches. The work highlights the practical impact of cross-domain graph foundation models for diverse applications and provides insights into the interpretability of expert routing and the value of representation augmentation.

Abstract

The growing ubiquity of relational data structured as graphs has underscored the need for graph learning models with exceptional generalization capabilities. However, current approaches often struggle to extract generalizable insights, frequently requiring extensive fine-tuning, which limits their versatility. Graph foundation models offer a transformative solution, with the potential to learn robust, generalizable representations from graph data. This enables more effective and adaptable applications across a wide spectrum of tasks and domains. In this work, we investigate a unified graph model, AnyGraph, designed to handle four key challenges: i) Structure Heterogeneity: addressing distribution shift in graph structural information; ii) Feature Heterogeneity: handling diverse feature representation spaces across graph datasets; iii) Fast Adaptation: efficiently adapting the model to new graph domains; iv) Scaling Law Emergence: enabling the model to exhibit scaling law behavior, where its performance scales favorably with the amount of data and the number of parameters. To tackle these challenges, we build AnyGraph upon a Graph Mixture-of-Experts (MoE) architecture. This approach empowers the model to manage both in-domain and cross-domain distribution shift concerning structure-level and feature-level heterogeneity. Furthermore, a lightweight graph expert routing mechanism is proposed to facilitate AnyGraph's fast adaptation to new data and domains. Our extensive experiments on 38 diverse graph datasets demonstrate the strong zero-shot learning performance of AnyGraph across graph domains with significant distribution shift. We have also validated the model's fast adaptation ability and scaling law emergence, showcasing its versatility.
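
The abstract names a lightweight expert routing mechanism but does not spell out how an expert is chosen. The following is a minimal sketch of one way such routing could work, not the authors' implementation. It assumes each expert is a pairwise link-scoring function, and it assumes a hypothetical competence score: the average sigmoid margin by which an expert scores observed edges above non-edges. The names `competence`, `route`, `homophily_expert`, and `heterophily_expert` and the toy graph are all illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def competence(score_fn, edges, num_nodes):
    """Assumed competence score: mean sigmoid margin between the expert's
    score for an observed edge (u, v) and its score for each non-edge (u, w)."""
    neighbors = {u: set() for u in range(num_nodes)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    total, count = 0.0, 0
    for u, v in edges:
        for w in range(num_nodes):
            if w == u or w in neighbors[u]:
                continue  # skip self-pairs and true neighbors
            total += sigmoid(score_fn(u, v) - score_fn(u, w))
            count += 1
    return total / count if count else 0.5

def route(experts, edges, num_nodes):
    """Score every expert on the input graph and pick the single best one."""
    scores = [competence(expert, edges, num_nodes) for expert in experts]
    best = max(range(len(experts)), key=lambda i: scores[i])
    return best, scores

# Toy graph (hypothetical): two triangles, nodes 0-2 and nodes 3-5.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]

def homophily_expert(u, v):
    # Favors links inside the same block of three nodes.
    return 1.0 if u // 3 == v // 3 else -1.0

def heterophily_expert(u, v):
    # Favors links across blocks, the opposite structural prior.
    return -homophily_expert(u, v)

best, scores = route([homophily_expert, heterophily_expert], EDGES, 6)
print(best, scores)  # the homophily expert (index 0) wins on this graph
```

Only the selected expert would then be used for prediction on that graph, so handling a new dataset costs one cheap scoring pass per expert rather than a round of fine-tuning.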

Paper Structure (30 sections, 9 equations, 6 figures, 4 tables)

This paper contains 30 sections, 9 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: AnyGraph's generalizability and scaling law behavior reveal its exceptional capabilities. Compared to baseline methods, AnyGraph's superior performance is evident in its cross-domain generalization ability.
  • Figure 2: The proposed graph Mixture-of-Experts (MoE) paradigm enables AnyGraph to learn a diverse ensemble of graph experts, each tailored to specific structural characteristics. The lightweight expert routing mechanism allows AnyGraph to quickly identify and activate the most relevant experts for a given input graph, without extensive retraining or fine-tuning.
  • Figure 3: Zero-shot and full-shot performance w.r.t. the number of model parameters and the amount of training samples.
  • Figure 4: Impact of different sub-modules on the zero-shot and full-shot prediction capabilities of AnyGraph.
  • Figure 5: Competence score between datasets and expert models, given by the routing mechanism of AnyGraph.
  • ...and 1 more figures