CLEAR: Cluster-based Prompt Learning on Heterogeneous Graphs

Feiyang Wang, Zhongbao Zhang, Junda Ye, Li Sun, Jianzhong Qi

TL;DR

This work tackles the challenge of aligning pretext tasks with downstream objectives in heterogeneous graphs, where existing feature prompts often ignore graph structure and meta-path semantics. It introduces CLEAR, a three-component framework that combines dual-view pre-training (structural and semantic), cluster-based prompts, and a meta-path template that guides prompting, with all components tied to a unified objective via losses such as $\mathcal{L}_{pre}$ and $\mathcal{L}_{prompt}$. Cluster prompts treat clusters as learnable tokens connected to target nodes, with an orthogonal constraint that keeps the prompts diverse; this reframes node-level tasks as link prediction between nodes and prompts. The meta-path template further injects high-order semantics by constructing prompt-aware meta-paths and employing a discriminator with adjacency-guided sampling for self-adversarial training. Empirical results on ACM, DBLP, and IMDB show improvements of up to 5% in F1 for node classification, along with stronger clustering performance, supporting the approach's ability to bridge pretext and downstream objectives in heterogeneous graphs.
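The two ideas in the summary above, an orthogonal constraint on cluster prompts and classification recast as link prediction to prompts, can be illustrated with a small sketch. This is not the paper's formulation: the penalty form $\lVert PP^\top - I \rVert_F^2$, the dot-product scoring with a softmax, and all function names are assumptions chosen for illustration only.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonality_penalty(prompts):
    """Assumed penalty: squared Frobenius norm of (P P^T - I).

    It is zero when the prompt vectors are orthonormal and grows as
    prompts become similar to one another.
    """
    total = 0.0
    for i, p in enumerate(prompts):
        for j, q in enumerate(prompts):
            target = 1.0 if i == j else 0.0
            total += (dot(p, q) - target) ** 2
    return total


def link_probabilities(node, prompts):
    """Assumed scoring: softmax over node-prompt dot products."""
    scores = [dot(node, p) for p in prompts]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_cluster(node, prompts):
    """Predict a label as the prompt the node is most likely linked to."""
    probs = link_probabilities(node, prompts)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Hypothetical 2-d prompt tokens and one node embedding.
    prompts = [[1.0, 0.0], [0.0, 1.0]]
    node = [0.9, 0.1]
    print(orthogonality_penalty(prompts))
    print(predict_cluster(node, prompts))
```

In this toy setup, each prompt stands for one cluster (or class), so predicting a node's label reduces to asking which prompt the node would most plausibly be linked to; the penalty discourages two prompts from collapsing onto the same direction.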

Abstract

Prompt learning has attracted increasing attention in the graph domain as a means to bridge the gap between pretext and downstream tasks. Existing studies on heterogeneous graph prompting typically use feature prompts to modify node features for specific downstream tasks, which do not concern the structure of heterogeneous graphs. Such a design also overlooks information from the meta-paths, which are core to learning the high-order semantics of the heterogeneous graphs. To address these issues, we propose CLEAR, a Cluster-based prompt LEARNING model on heterogeneous graphs. We present cluster prompts that reformulate downstream tasks as heterogeneous graph reconstruction. In this way, we align the pretext and downstream tasks to share the same training objective. Additionally, our cluster prompts are also injected into the meta-paths such that the prompt learning process incorporates high-order semantic information entailed by the meta-paths. Extensive experiments on downstream tasks confirm the superiority of CLEAR. It consistently outperforms state-of-the-art models, achieving up to 5% improvement on the F1 metric for node classification.

Paper Structure

This paper contains 17 sections, 15 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Cluster prompts in heterogeneous graphs. "Paper" is the target node and "Prompt" represents the cluster.
  • Figure 2: Overall framework of CLEAR. The pre-training module learns structural and semantic features of heterogeneous graphs within a contrastive framework. The prompt module introduces cluster prompts to learn clustering features, also trained within the same contrastive framework. The template module integrates prompts into meta-paths to enhance prompt learning with high-order semantics. Parameters are transferred from the pre-training module to the prompt module and template module.
  • Figure 3: Model performance w.r.t. number of shots on the node classification task.
  • Figure 4: Visualization of node embeddings under the zero-shot setting on DBLP.

Theorems & Definitions (2)

  • Definition 1
  • Definition 2