Towards A Universal Graph Structural Encoder

Jialin Chen, Haolan Zuo, Haoyu Peter Wang, Siqi Miao, Pan Li, Rex Ying

TL;DR

GFSE introduces a universal graph structural encoder pre-trained across multiple domains with four self-supervised objectives, leveraging a Graph Transformer with biased attention to produce expressive positional and structural encodings (PSE). By integrating relative structural information via random-walk-based encodings and SEG-WL-inspired expressiveness, GFSE achieves strong cross-domain transfer, improving downstream GNNs and enabling seamless augmentation of text-attributed graphs and LLMs. Empirical results across synthetic and real-world datasets show GFSE delivers robust gains, including state-of-the-art performance in many settings and notable improvements in molecular and large-scale graph tasks. The work demonstrates the practicality of a domain-agnostic graph foundation model that reduces task-specific fine-tuning and supports integration with downstream feature encoders and language models for broad applicability.

Abstract

Recent advancements in large-scale pre-training have shown the potential to learn generalizable representations for downstream tasks. In the graph domain, however, capturing and transferring structural information across different graph domains remains challenging, primarily due to the inherent differences in topological patterns across various contexts. Additionally, most existing models struggle to capture the complexity of rich graph structures, leading to inadequate exploration of the embedding space. To address these challenges, we propose GFSE, a universal graph structural encoder designed to capture transferable structural patterns across diverse domains such as molecular graphs, social networks, and citation networks. GFSE is the first cross-domain graph structural encoder pre-trained with multiple self-supervised learning objectives. Built on a Graph Transformer, GFSE incorporates attention mechanisms informed by graph inductive bias, enabling it to encode intricate multi-level and fine-grained topological features. The pre-trained GFSE produces generic and theoretically expressive positional and structural encodings for graphs, which can be seamlessly integrated with various downstream graph feature encoders, including graph neural networks for vectorized features and Large Language Models for text-attributed graphs. Comprehensive experiments on synthetic and real-world datasets demonstrate GFSE's capability to significantly enhance the model's performance while requiring substantially less task-specific fine-tuning. Notably, GFSE achieves state-of-the-art performance in 81.6% of the evaluated cases, spanning diverse graph models and datasets, highlighting its potential as a powerful and versatile encoder for graph-structured data.

Paper Structure

This paper contains 39 sections, 5 theorems, 13 equations, 6 figures, 14 tables.

Key Result

Proposition 3.1

RW($d$)-SEG-WL ($d\geq 3$) is strictly more expressive than 1-WL in testing non-isomorphic graphs.
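
To make the gap between 1-WL and random-walk information concrete, the sketch below compares two graphs that 1-WL cannot separate: two disjoint triangles and a single 6-cycle. It is a simplified illustration, not the RW($d$)-SEG-WL test itself. As an assumption of this example, per-node random-walk return probabilities stand in for the relative random-walk encodings, and the helper names (`cycle`, `wl_histogram`, `return_probabilities`) are hypothetical. The example exhibits one separating pair and does not prove the proposition.

```python
from collections import Counter
from fractions import Fraction


def cycle(n, offset=0):
    """Adjacency list of an n-cycle on nodes offset .. offset + n - 1."""
    return {offset + i: [offset + (i - 1) % n, offset + (i + 1) % n] for i in range(n)}


def wl_histogram(adj, rounds):
    """1-WL colour refinement; returns the multiset of final node colours.

    Colours are nested tuples, so they are directly comparable across graphs.
    """
    colors = {v: () for v in adj}
    for _ in range(rounds):
        colors = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
    return Counter(colors.values())


def return_probabilities(adj, d):
    """Sorted per-node tuples of return probabilities after 1..d random-walk steps."""
    nodes = sorted(adj)
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Row-stochastic transition matrix P = D^{-1} A, kept exact with Fractions.
    P = [[Fraction(0)] * n for _ in range(n)]
    for v in nodes:
        for u in adj[v]:
            P[index[v]][index[u]] += Fraction(1, len(adj[v]))
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    diagonals = []
    for _ in range(d):
        power = [[sum(power[i][k] * P[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
        diagonals.append([power[i][i] for i in range(n)])
    return sorted(tuple(diag[i] for diag in diagonals) for i in range(n))


two_triangles = {**cycle(3), **cycle(3, offset=3)}
hexagon = cycle(6)

# 1-WL: every node is 2-regular, so refinement never splits the colours.
same_under_wl = wl_histogram(two_triangles, rounds=6) == wl_histogram(hexagon, rounds=6)

# Walks of length <= 2 agree on both graphs; length 3 separates them.
same_at_d2 = return_probabilities(two_triangles, 2) == return_probabilities(hexagon, 2)
same_at_d3 = return_probabilities(two_triangles, 3) == return_probabilities(hexagon, 3)
```

For this particular pair, the graphs only differ once walks of length 3 are considered: a triangle node can return to itself in three steps, while the bipartite 6-cycle cannot.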

Figures (6)

  • Figure 1: A) GFSE is pre-trained on 8 datasets from 6 different domains. Pre-training tasks include reconstruction (shortest path distance regression, motif counting) and contrastive learning (community detection, graph contrastive learning). B) GFSE generates generic and expressive Positional and Structural Encoding (PSE) to tackle topological graph tasks. GFSE can also be seamlessly integrated into downstream feature encoders for feature-enriched tasks by concatenating with initial vectorized features or prepending the generated PSE to the textual prompt as a soft token.
  • Figure 2: Pre-training performance with different architectures. TF stands for Transformer.
  • Figure 3: (a) Learning task uncertainty ($\sigma^2$) w.r.t. pre-training. (b) Performance of GFSE pre-trained on different data sizes. Evaluation datasets are MolPCBA and Arxiv. The base model is GPS.
  • Figure 4: Illustration of the subgraphs (graphlets) used in motif counting. Each subgraph is indexed and labeled for reference.
  • Figure 5: Biased Attention based on random walk matrix.
  • ...and 1 more figures
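
The reconstruction tasks named in the Figure 1 caption (shortest path distance regression and motif counting) need structural targets computed from the raw graph. A minimal sketch of how such targets could be derived is given below. It assumes unweighted, undirected graphs stored as adjacency lists and uses triangles as the only motif; the graphlets and any target normalization actually used by GFSE are not specified here, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path_distances(adj):
    """All-pairs hop distances via BFS; unreachable pairs are omitted."""
    dist = {}
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in seen:
                    seen[u] = seen[v] + 1
                    queue.append(u)
        dist[source] = seen
    return dist


def triangle_counts(adj):
    """Number of triangles each node participates in."""
    neighbours = {v: set(adj[v]) for v in adj}
    return {
        v: sum(1 for a, b in combinations(sorted(neighbours[v]), 2) if b in neighbours[a])
        for v in adj
    }


# Toy graph: a triangle 0-1-2 with a pendant path 2-3-4.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
spd = shortest_path_distances(toy)   # pairwise regression targets
triangles = triangle_counts(toy)     # per-node motif-count targets
```

Targets of this kind depend only on topology, not on node features, which is what allows the same objectives to be reused across graphs from different domains.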

Theorems & Definitions (6)

  • Proposition 3.1
  • Proposition 3.2
  • Proposition 4.1
  • Proposition 4.2
  • proof
  • Proposition 4.3