When LLMs meet open-world graph learning: a new perspective for unlabeled data uncertainty
Yanzhe Wen, Xunkai Li, Qi Zhang, Zhu Lei, Guang Zeng, Rong-Hua Li, Guoren Wang
TL;DR
This work tackles unlabeled data uncertainty in text-attributed graphs under open-world conditions by introducing Open-world Graph Assistant (OGA), a fully automated pipeline that combines Adaptive Label Traceability (ALT) with a Graph Label Annotator (GLA). ALT builds a compact, discriminative ontology space by fusing semantic and topological signals through a graph propagation framework and an entropy-based rejection mechanism, supported by theoretical guarantees of low rank and bounded uncertainty. GLA uses structure-guided prompts and multi-granularity annotation to distill and fuse labels across communities, substantially reducing LLM calls while producing coherent unknown-class annotations for model retraining. Across nine diverse datasets, OGA achieves state-of-the-art performance in unknown-class rejection, and downstream GNNs improve when retrained on GLA-generated annotations, demonstrating a practical, scalable approach to open-world graph learning with limited labels.
Abstract
Recently, large language models (LLMs) have significantly advanced text-attributed graph (TAG) learning. However, existing methods inadequately handle data uncertainty in open-world scenarios, particularly limited labeling and unknown-class nodes. Prior solutions typically rely on isolated semantic or structural approaches for unknown-class rejection and lack effective annotation pipelines. To address these limitations, we propose Open-world Graph Assistant (OGA), an LLM-based framework that combines adaptive label traceability, which integrates semantics and topology for unknown-class rejection, with a graph label annotator that enables model updates using newly annotated nodes. Comprehensive experiments demonstrate OGA's effectiveness and practicality.
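To make the idea of combining topology with entropy-based unknown-class rejection concrete, here is a minimal sketch of the general technique. It is not the authors' ALT implementation. It assumes each node already has a probability vector over the known classes (for example, from a text encoder), smooths those vectors over the graph with a simple neighbor-averaging update, and flags nodes whose Shannon entropy stays above a threshold. The function names `propagate` and `reject_unknown`, the parameters `alpha`, `steps`, and `threshold`, and the toy data are all hypothetical.

```python
import math


def propagate(probs, edges, alpha=0.5, steps=2):
    """Smooth per-node class probabilities over an undirected graph.

    At each step a node keeps a fraction `alpha` of its current
    distribution and takes the rest from the mean of its neighbors'
    distributions. Isolated nodes are left unchanged.
    """
    n = len(probs)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    current = [list(p) for p in probs]
    for _ in range(steps):
        updated = []
        for i in range(n):
            if not neighbors[i]:
                updated.append(list(current[i]))
                continue
            k = len(current[i])
            mean = [
                sum(current[j][c] for j in neighbors[i]) / len(neighbors[i])
                for c in range(k)
            ]
            updated.append(
                [alpha * current[i][c] + (1 - alpha) * mean[c] for c in range(k)]
            )
        current = updated
    return current


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def reject_unknown(probs, edges, threshold, alpha=0.5, steps=2):
    """Return True for nodes whose smoothed prediction is too uncertain."""
    smoothed = propagate(probs, edges, alpha, steps)
    return [entropy(p) > threshold for p in smoothed]


# Toy graph (assumed data): nodes 0-2 form a confident cluster,
# nodes 3-4 form an ambiguous pair that no known class explains well.
probs = [[0.9, 0.1], [0.8, 0.2], [0.95, 0.05], [0.5, 0.5], [0.55, 0.45]]
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
flags = reject_unknown(probs, edges, threshold=0.6)
print(flags)  # [False, False, False, True, True]
```

In this sketch the threshold is fixed by hand. For two classes the maximum entropy is ln 2 (about 0.693), so 0.6 only rejects nodes that remain close to uniform after smoothing. A real system would need to choose or adapt this value from data.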
