Both Topology and Text Matter: Revisiting LLM-guided Out-of-Distribution Detection on Text-attributed Graphs
Yinlin Zhu, Di Wu, Xu Wang, Guocong Quan, Miao Hu
TL;DR
This work tackles OOD detection in text-attributed graphs by bridging topology and textual semantics. It introduces LG-Plug, a plug-and-play framework that first aligns topology- and text-derived representations and then derives consensus-driven OOD exposure through clustered, iterative LLM prompting with a lightweight in-cluster codebook. The obtained OOD exposure acts as a regularizer, enabling seamless integration with existing topology-driven detectors and yielding consistent improvements across six TAG benchmarks, including notable FPR95 reductions and AUROC gains. The approach demonstrates that jointly modeling textual and structural information, with efficient LLM guidance, offers robust, scalable OOD detection for open-world graph applications.
Abstract
Text-attributed graphs (TAGs) associate nodes with textual attributes and graph structure, enabling GNNs to jointly model semantic and structural information. While effective on in-distribution (ID) data, GNNs in real-world settings often encounter out-of-distribution (OOD) nodes with unseen textual or structural patterns; without reliable OOD detection, this leads to overconfident and erroneous predictions. Early approaches address this issue from a topology-driven perspective, leveraging neighboring structures to mitigate node-level detection bias. However, these methods typically encode node texts as shallow vector features, failing to fully exploit rich semantic information. In contrast, recent LLM-based approaches generate pseudo OOD priors by leveraging textual knowledge, but they suffer from two limitations: (1) a reliability-informativeness imbalance in the synthesized OOD priors, as the generated OOD exposures either deviate from the true OOD semantics or introduce non-negligible ID noise, both of which limit the improvement in detection performance; (2) reliance on specialized architectures, which prevents them from incorporating the many effective topology-level insights that prior work has empirically validated. To this end, we propose LG-Plug, an LLM-Guided Plug-and-play strategy for TAG OOD detection. LG-Plug aligns topology and text representations to produce fine-grained node embeddings, then generates consensus-driven OOD exposure via clustered iterative LLM prompting. Moreover, it leverages a lightweight in-cluster codebook and heuristic sampling to reduce the time cost of LLM querying. The resulting OOD exposure serves as a regularization term to separate ID and OOD nodes, enabling seamless integration with existing detectors.
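
To make the last point concrete, the sketch below shows one way an OOD exposure set could enter a detector's training objective as a regularizer. The abstract does not specify the form of the regularization term, so everything here is an assumption for illustration: the energy-margin penalty is borrowed from standard energy-based OOD detection, and the names `exposure_regularizer`, `total_loss`, `m_id`, `m_ood`, and `lam` are hypothetical, not taken from LG-Plug.

```python
import math

def energy(logits):
    """Energy score: negative log-sum-exp of the logits (lower = more ID-like)."""
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

def exposure_regularizer(id_logits, ood_logits, m_id=-5.0, m_ood=-1.0):
    """Assumed squared-hinge penalty (not the paper's stated form).

    Penalizes ID nodes whose energy is above m_id and exposed OOD nodes
    whose energy is below m_ood, pushing the two groups apart.
    """
    id_term = sum(max(0.0, energy(z) - m_id) ** 2 for z in id_logits) / len(id_logits)
    ood_term = sum(max(0.0, m_ood - energy(z)) ** 2 for z in ood_logits) / len(ood_logits)
    return id_term + ood_term

def total_loss(ce_loss, id_logits, ood_logits, lam=0.1):
    """Supervised loss of the base detector plus the weighted exposure penalty."""
    return ce_loss + lam * exposure_regularizer(id_logits, ood_logits)

# Confident ID logits and flat logits on exposed OOD nodes incur no penalty.
separated = exposure_regularizer([[10.0, 0.0]], [[0.0, 0.0]])
# Swapping the roles violates both margins and yields a positive penalty.
confused = exposure_regularizer([[0.0, 0.0]], [[10.0, 0.0]])
```

Because the penalty only adds a term to the loss, it leaves the base detector's architecture untouched, which is the sense in which such a regularizer can be plugged into an existing topology-driven method.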
