A Survey of Cross-domain Graph Learning: Progress and Future Directions

Haihong Zhao, Zhixun Li, Chenyi Zi, Aochuan Chen, Fugee Tsung, Jia Li, Jeffrey Xu Yu

TL;DR

The paper addresses the challenge of transferring graph knowledge across heterogeneous domains, a step toward true graph foundation models. It introduces a threefold taxonomy of CDGL methods (structure-oriented, feature-oriented, and mixture-oriented) and further characterizes them by three cross-domain scales (Limited, Conditional, Open) and three difficulty levels (High, Moderate, Low). By systematically reviewing representative methods along these dimensions, the work highlights open problems such as feature alignment, data scale, and evaluation benchmarks, and proposes future directions toward robust open cross-domain graph generalization. The survey emphasizes the potential of integrating structure and semantics, including recent advances with LLMs and cross-domain pretraining, to enable domain-agnostic graph representations with practical impact on diverse applications. Overall, the paper maps the landscape, identifies gaps, and guides researchers toward graph foundation models that generalize across domains.

Abstract

Graph learning plays a vital role in mining and analyzing complex relationships within graph data and has been widely applied to real-world scenarios such as social, citation, and e-commerce networks. Foundation models in computer vision (CV) and natural language processing (NLP) have demonstrated remarkable cross-domain capabilities that are equally significant for graph data. However, existing graph learning approaches often struggle to generalize across domains. Motivated by recent advances in CV and NLP, cross-domain graph learning (CDGL) has gained renewed attention as a promising step toward realizing true graph foundation models. In this survey, we provide a comprehensive review and analysis of existing works on CDGL. We propose a new taxonomy that categorizes existing approaches according to the type of transferable knowledge learned across domains: structure-oriented, feature-oriented, and mixture-oriented. Based on this taxonomy, we systematically summarize representative methods in each category, discuss the key challenges and limitations of current studies, and outline promising directions for future research. A continuously updated collection of related works is available at: https://github.com/cshhzhao/Awesome-Cross-Domain-Graph-Learning.

Paper Structure

This paper contains 33 sections, 11 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Cross-domain graph learning aims to integrate knowledge from multi-domain source graphs and transfer it to diverse target domains.
  • Figure 2: A taxonomy of graph models for solving cross-domain graph learning with representative examples.
  • Figure 3: Structure-oriented CDGL: (a) a shared generator produces diverse structures for both source and target domains; (b) structure contrast explicitly or implicitly constructs positive–negative pairs in the source domain to improve generalization to target domains.
  • Figure 4: Feature-oriented CDGL: (a) aligns feature semantics across domains; (b) additionally aligns feature dimensions across graph domains to achieve more general CDGL. Prompt techniques are optional for both approaches.
  • Figure 5: Mixture-oriented CDGL. (a) Feature-Structure Mixture: first extracts transferable features, then integrates structural knowledge. (b) Structure-Feature Mixture: first extracts shared structural patterns, then incorporates feature information. (c) GNN-based Unified Mixture: a unified extractor jointly learns cross-domain structural and feature patterns. (d) Flatten-based Unified Mixture: graphs are flattened into token sequences, leveraging the powerful reasoning ability of LLMs to align and extract shared cross-domain knowledge. (a)-(b) follow a sequential-mixture integration paradigm, whereas (c)-(d) adopt a unified-mixture strategy. The red flame icon denotes trainable modules.
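
The flatten-based idea in Figure 5(d) can be illustrated with a minimal sketch: graphs from different domains are serialized into one shared textual format that a language model could then consume. The function name `flatten_graph`, the `node i: ...` / `edge u - v` line format, and the toy graphs are all hypothetical choices made for illustration; the survey's caption does not prescribe a serialization, and no LLM is called here.

```python
from typing import Dict, List, Tuple


def flatten_graph(
    node_text: Dict[int, str],
    edges: List[Tuple[int, int]],
) -> str:
    """Serialize a text-attributed graph into one string (hypothetical format)."""
    # List nodes in sorted id order so the output is deterministic.
    node_lines = [f"node {i}: {node_text[i]}" for i in sorted(node_text)]
    # Treat edges as undirected: keep each pair once, smaller id first.
    unique_edges = sorted({(min(u, v), max(u, v)) for u, v in edges})
    edge_lines = [f"edge {u} - {v}" for u, v in unique_edges]
    return "\n".join(node_lines + edge_lines)


# Two toy graphs from different domains end up in the same textual format.
citation = flatten_graph({0: "paper on GNNs", 1: "paper on LLMs"}, [(0, 1)])
molecule = flatten_graph(
    {0: "carbon", 1: "oxygen", 2: "hydrogen"},
    [(1, 0), (0, 2), (2, 0)],
)
print(citation)
print(molecule)
```

Because both graphs share one sequence format, a single downstream text model could in principle process them without domain-specific input layers. Real systems would also need to handle long graphs, node ordering, and tokenization limits, which this sketch ignores.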