A Survey on Self-Supervised Graph Foundation Models: Knowledge-Based Perspective

Ziwen Zhao, Yixin Su, Yuhua Li, Yixiong Zou, Ruixuan Li, Rui Zhang

TL;DR

This survey introduces a knowledge-based taxonomy for self-supervised graph foundation models (GFMs), organizing graph knowledge into microscopic, mesoscopic, and macroscopic categories and mapping them to over 25 pretext tasks across GNN/GT/GLM architectures. It unifies pre-training and downstream tuning under a common framework, highlights representative methods, and analyzes efficiency, scalability, and applicability. The work also surveys self-supervised graph language models, detailing how GLMs integrate graph knowledge through pre-training and how prompting and PEFT enable efficient downstream adaptation. Challenging directions include combining knowledge patterns, cross-graph-type adaptation, bias robustness, and explainable reasoning with RoG and RAG approaches, signaling a roadmap for robust, scalable, and interpretable GFMs. Collectively, the taxonomy and syntheses offer a foundation for designing generalized GFMs that leverage graph-specific knowledge across diverse data modalities and tasks.

Abstract

Graph self-supervised learning (SSL) is now a go-to method for pre-training graph foundation models (GFMs). There is a wide variety of knowledge patterns embedded in the graph data, such as node properties and clusters, which are crucial to learning generalized representations for GFMs. However, existing surveys of GFMs have several shortcomings: they lack comprehensiveness regarding the most recent progress, have unclear categorization of self-supervised methods, and take a limited architecture-based perspective that is restricted to only certain types of graph models. As the ultimate goal of GFMs is to learn generalized graph knowledge, we provide a comprehensive survey of self-supervised GFMs from a novel knowledge-based perspective. We propose a knowledge-based taxonomy, which categorizes self-supervised graph models by the specific graph knowledge utilized. Our taxonomy consists of microscopic (nodes, links, etc.), mesoscopic (context, clusters, etc.), and macroscopic knowledge (global structure, manifolds, etc.). It covers a total of 9 knowledge categories and more than 25 pretext tasks for pre-training GFMs, as well as various downstream task generalization strategies. Such a knowledge-based taxonomy allows us to re-examine graph models based on new architectures more clearly, such as graph language models, as well as provide more in-depth insights for constructing GFMs.

Paper Structure

This paper contains 30 sections, 10 equations, 9 figures, and 7 tables.

Figures (9)

  • Figure 1: How self-supervised GFMs are believed to work: pre-training and downstream tuning. Updating the pre-trained model during downstream tuning is optional depending on the tuning strategy.
  • Figure 2: Our knowledge-based taxonomy of self-supervised graph pre-training with representative literature.
  • Figure 3: An illustration of discrimination tasks between node features. The similarity is defined as the dot product between embeddings $\mathbf{Z}$. Node instance discrimination performs row-wise contrast, which computes the similarity between every pair of node embeddings (distinguished by the node shape: circle/triangle/square). Dimension discrimination performs column-wise contrast, which computes the similarity between every pair of embedding dimensions (distinguished by the embedding color: orange/red).
  • Figure 4: An illustration of the contextual knowledge. For the central node of a 2-hop subgraph (Left), context discrimination often takes its neighboring nodes or as positive samples and other nodes as negative samples. Contextual subgraph discrimination samples multiple contextual subgraphs (Right) as positive pairs, while negative ones are sampled from other subgraphs.
  • Figure 5: A molecular graph is converted into a fragment graph by aggregating functional groups into supernodes. For motif prediction, each supernode embedding is matched with a prototype in a motif dictionary.
  • ...and 4 more figures
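
The row-wise and column-wise contrasts described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes the embeddings $\mathbf{Z}$ form an N x D matrix with one row per node, and it computes only the two dot-product similarity matrices; the function name `similarity_matrices` and the toy values are hypothetical, and no loss function or method from the paper is reproduced here.

```python
import numpy as np

def similarity_matrices(Z):
    """Return dot-product similarities between rows (nodes) and columns (dimensions) of Z.

    Assumption: Z has shape (N, D), one D-dimensional embedding per node.
    """
    Z = np.asarray(Z, dtype=float)
    node_sim = Z @ Z.T  # shape (N, N): row-wise contrast, every pair of node embeddings
    dim_sim = Z.T @ Z   # shape (D, D): column-wise contrast, every pair of embedding dimensions
    return node_sim, dim_sim

# Toy example (hypothetical values): 3 nodes, 2 embedding dimensions.
Z = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
node_sim, dim_sim = similarity_matrices(Z)
print(node_sim.shape, dim_sim.shape)
```

In this sketch, node instance discrimination would operate on `node_sim` (one entry per node pair), while dimension discrimination would operate on `dim_sim` (one entry per pair of embedding dimensions). How positives and negatives are then chosen from these matrices depends on the specific method.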