Unified Multi-Domain Graph Pre-training for Homogeneous and Heterogeneous Graphs via Domain-Specific Expert Encoding

Chundong Liang; Yongqi Huang; Dongxiao He; Peiyuan Li; Yawen Li; Di Jin; Weixiong Zhang

TL;DR

We empirically demonstrate that a balanced mixture of homogeneous and heterogeneous graph pre-training benefits downstream tasks, and we propose $\text{GPH}^{2}$, a unified multi-domain pre-training method that enables stable transfer across graph types and domains and significantly outperforms existing graph pre-training methods.

Abstract

Graph pre-training has achieved remarkable success in recent years, delivering transferable representations for downstream adaptation. However, most existing methods are designed for either homogeneous or heterogeneous graphs, thereby hindering unified graph modeling across diverse graph types. This separation contradicts real-world applications, where mixed homogeneous and heterogeneous graphs are ubiquitous, and distribution shifts between upstream pre-training and downstream deployment are common. In this paper, we empirically demonstrate that a balanced mixture of homogeneous and heterogeneous graph pre-training benefits downstream tasks and propose a unified multi-domain \textbf{G}raph \textbf{P}re-training method across \textbf{H}omogeneous and \textbf{H}eterogeneous graphs ($\mathbf{GPH^{2}}$). To address the lack of a unified encoder for homogeneous and heterogeneous graphs, we propose a Unified Multi-View Graph Construction that simultaneously encodes both without explicit graph-type-specific designs. To cope with the increased cross-domain distribution discrepancies arising from mixed graphs, we introduce domain-specific expert encoding. Each expert is independently pre-trained on a single graph to capture domain-specific knowledge, thereby shielding the pre-training encoder from the adverse effects of cross-domain discrepancies. For downstream tasks, we further design a Task-oriented Expert Fusion Strategy that adaptively integrates multiple experts based on their discriminative strengths. Extensive experiments on mixed graphs demonstrate that $\text{GPH}^{2}$ enables stable transfer across graph types and domains, significantly outperforming existing graph pre-training methods.

Paper Structure (40 sections, 18 equations, 7 figures, 6 tables)

Introduction
Problem Definition and Motivational Study
Problem Definition
Motivational Study
Observation 1: Increasing Pre-training Domain Diversity Improves Downstream Performance.
Observation 2: Incorporating Homogeneous Graphs Can Benefit Heterogeneous Downstream Tasks.
Method
Overview
Unified Multi-View Graph Construction
Domain-Specific Expert Encoding
Task-oriented Expert Fusion
...and 25 more sections

Figures (7)

Figure 1: The fragmented progress of graph pre-training across homogeneous and heterogeneous graphs. Their complementary roles may be better suited to real-world mixed graph types and domain shifts.
Figure 2: Performance of 3-shot node classification under different pre-training graphs on ACM and DBLP.
Figure 3: The workflow of $\text{GPH}^{2}$, which is designed for learning across mixed graphs (left). It consists of two stages: multi-domain pre-training (top) and downstream task adaptation (bottom). In the pre-training stage, each source graph is processed independently by a domain-specific expert; in the downstream stage, all pre-trained experts are transferred for adaptation.
Figure 4: Ablation study of $\text{GPH}^{2}$ on ACM and Cora.
Figure 5: Attention scores of the Task-Oriented Expert Fusion module on ACM (50-shot node classification). The three colors represent three independent experiments with different graphs for pre-training.
...and 2 more figures
