ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs
Yuhan Li, Peisong Wang, Zhixun Li, Jeffrey Xu Yu, Jia Li
TL;DR
The paper tackles cross-dataset zero-shot transfer for graph node classification by unifying node and class representations in a language-model-derived semantic space, and by enriching pre-training data through a prompt-based subgraph sampling strategy. It introduces ZeroG, which combines an LM-based unified representation, prompting nodes that enrich the semantics of sampled subgraphs, and a parameter-efficient LoRA pre-training regime to enable robust zero-shot generalization across heterogeneous graphs. Key contributions include a comprehensive analysis of cross-dataset zero-shot transfer, an architecture that significantly improves in-domain and cross-domain transfer on seven benchmarks, and an ablation study validating the importance of each component. The approach advances graph foundation model research, comes with a publicly available codebase, and demonstrates practical potential for generalizing graph reasoning without dataset-specific fine-tuning.
Abstract
With the development of foundation models such as large language models, zero-shot transfer learning has become increasingly significant. This is highlighted by the generative capabilities of NLP models like GPT-4 and the retrieval-based approaches of CV models like CLIP, both of which effectively bridge the gap between seen and unseen data. In graph learning, the continuous emergence of new graphs and the difficulty of human labeling also amplify the need for zero-shot transfer learning, driving the exploration of approaches that can generalize across diverse graph data without dataset-specific or label-specific fine-tuning. In this study, we extend such paradigms to zero-shot transferability in graphs by introducing ZeroG, a new framework tailored to enable cross-dataset generalization. To address inherent challenges such as feature misalignment, mismatched label spaces, and negative transfer, we leverage a language model to encode both node attributes and class semantics, ensuring consistent feature dimensions across datasets. We also propose a prompt-based subgraph sampling module that enriches the semantic and structural information of extracted subgraphs using prompting nodes and neighborhood aggregation, respectively. We further adopt a lightweight fine-tuning strategy that reduces the risk of overfitting and maintains the zero-shot learning efficacy of the language model. The results underscore the effectiveness of our model in achieving significant cross-dataset zero-shot transferability, opening pathways for the development of graph foundation models. Code and data are available at https://github.com/NineAbyss/ZeroG.
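The idea of classifying nodes by comparing them with class descriptions in one shared embedding space can be illustrated with a minimal sketch. This is not the ZeroG implementation: the hand-written vectors below stand in for language-model embeddings of node text and class names, the GCN-style normalized aggregation is an assumed choice of neighborhood aggregation, and the function names (l2_normalize, aggregate_neighbors, zero_shot_predict) are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norm, 1e-12, None)


def aggregate_neighbors(features, adjacency, hops=1):
    """Smooth node features over the graph (assumed GCN-style aggregation).

    Uses the symmetrically normalized adjacency matrix with self-loops,
    applied `hops` times. No parameters are trained here.
    """
    a = adjacency + np.eye(adjacency.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    a_norm = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = features
    for _ in range(hops):
        out = a_norm @ out
    return out


def zero_shot_predict(node_emb, class_emb, adjacency, hops=1):
    """Assign each node the class whose embedding is most cosine-similar."""
    h = l2_normalize(aggregate_neighbors(node_emb, adjacency, hops))
    c = l2_normalize(class_emb)
    scores = h @ c.T  # shape: (num_nodes, num_classes)
    return scores.argmax(axis=1)


# Toy stand-ins for LM embeddings (assumption: a real pipeline would obtain
# these by encoding node text and class names with the same language model).
node_emb = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
])
class_emb = np.array([
    [1.0, 0.0, 0.0],  # class 0 description
    [0.0, 1.0, 0.0],  # class 1 description
])
# Undirected edges: 0-1 and 2-3.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

predictions = zero_shot_predict(node_emb, class_emb, adjacency)
print(predictions.tolist())
```

Because node and class vectors share one space and one dimensionality, the same scoring step applies to any dataset regardless of its original feature size or label set, which is the property the abstract relies on for cross-dataset transfer. On the toy graph above, nodes 0 and 1 are assigned class 0 and nodes 2 and 3 are assigned class 1.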
