Graph Neural Prompting with Large Language Models
Yijun Tian, Huan Song, Zichen Wang, Haozhu Wang, Ziqing Hu, Fang Wang, Nitesh V. Chawla, Panpan Xu
TL;DR
This work addresses the difficulty large language models have in capturing grounded knowledge by introducing Graph Neural Prompting (GNP), a plug-and-play framework that retrieves subgraphs from knowledge graphs and encodes them into a trainable soft prompt fed to pre-trained LLMs. GNP uses a GNN encoder to capture graph structure, a cross-modality pooling module to align graph and text, a domain projector to map embeddings into the LLM space, and a self-supervised link-prediction objective to refine relational understanding. Experiments on commonsense and biomedical reasoning with multiple LLM sizes show that GNP provides substantial gains when the LLM is frozen and competitive improvements when the LLM is tuned, often matching or surpassing full fine-tuning. The results demonstrate that instance-level, graph-informed prompts can inject grounded knowledge into LLMs without large-scale retraining, enabling scalable knowledge-enhanced reasoning across domains.
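The pipeline described above (GNN encoder, cross-modality pooling, domain projector, soft prompt) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean-neighbor aggregation, the single dot-product attention step, the single linear projector, the random weights, the toy dimensions, and every function name are hypothetical choices made for readability. Text features are assumed to already live in the graph embedding dimension.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gnn_layer(node_feats, edges, weight):
    """One round of mean-neighbor aggregation, then a linear map and ReLU (assumed GNN)."""
    n, dim = len(node_feats), len(node_feats[0])
    neighbors = {i: [i] for i in range(n)}  # self-loops
    for src, dst in edges:                  # treat edges as undirected
        neighbors[dst].append(src)
        neighbors[src].append(dst)
    agg = [[sum(node_feats[j][d] for j in neighbors[i]) / len(neighbors[i])
            for d in range(dim)] for i in range(n)]
    return [[max(0.0, v) for v in row] for row in matmul(agg, weight)]

def cross_modality_pool(node_feats, text_feats):
    """Let each node attend to the text tokens, then average nodes into one graph vector."""
    dim = len(node_feats[0])
    attended = []
    for node in node_feats:
        scores = [sum(a * b for a, b in zip(node, tok)) / math.sqrt(dim) for tok in text_feats]
        weights = softmax(scores)
        attended.append([node[d] + sum(w * tok[d] for w, tok in zip(weights, text_feats))
                         for d in range(dim)])
    return [sum(row[d] for row in attended) / len(attended) for d in range(dim)]

def project(graph_vec, weight):
    """Domain projector: map the graph vector into the LLM embedding dimension."""
    return matmul([graph_vec], weight)[0]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
GRAPH_DIM, LLM_DIM = 4, 6                      # toy sizes, chosen arbitrarily
nodes = random_matrix(3, GRAPH_DIM, rng)       # 3 entities in a retrieved subgraph
edges = [(0, 1), (1, 2)]
text = random_matrix(5, GRAPH_DIM, rng)        # 5 question tokens
token_embeds = random_matrix(5, LLM_DIM, rng)  # stand-in for frozen LLM token embeddings

hidden = gnn_layer(nodes, edges, random_matrix(GRAPH_DIM, GRAPH_DIM, rng))
graph_vec = cross_modality_pool(hidden, text)
prompt = project(graph_vec, random_matrix(GRAPH_DIM, LLM_DIM, rng))
llm_input = [prompt] + token_embeds            # soft prompt prepended to the token sequence
```

In this sketch only the three weight matrices would be trained; the LLM's own token embeddings are left untouched, which is what makes the prompt plug-and-play for a frozen model.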
Abstract
Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.
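To make the self-supervised link prediction objective concrete, the following is a minimal sketch of one common formulation: score each (head, relation, tail) triple and push true edges above corrupted ones. The abstract does not specify the scoring function or the loss, so the DistMult-style score, the logistic loss, the function names, and the toy entities and vectors below are all assumptions made for illustration.

```python
import math

def distmult_score(head, rel, tail):
    """DistMult-style score: sum of element-wise products (assumed scorer)."""
    return sum(h * r * t for h, r, t in zip(head, rel, tail))

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def link_prediction_loss(entity, relation, positives, negatives):
    """Average logistic loss: true triples should score high, corrupted ones low."""
    loss = 0.0
    for h, r, t in positives:
        loss -= log_sigmoid(distmult_score(entity[h], relation[r], entity[t]))
    for h, r, t in negatives:
        loss -= log_sigmoid(-distmult_score(entity[h], relation[r], entity[t]))
    return loss / (len(positives) + len(negatives))

# Hypothetical toy embeddings; in a GNP-like setup the entity vectors
# would come from the GNN encoder rather than being fixed by hand.
entity = {
    "aspirin": [1.0, 0.0],
    "headache": [1.0, 0.0],
    "bone": [-1.0, 0.0],
}
relation = {"treats": [2.0, 1.0]}

true_edges = [("aspirin", "treats", "headache")]
corrupted_edges = [("aspirin", "treats", "bone")]  # tail replaced by a random entity

good_loss = link_prediction_loss(entity, relation, true_edges, corrupted_edges)
bad_loss = link_prediction_loss(entity, relation, corrupted_edges, true_edges)
```

Here `good_loss` is smaller than `bad_loss` because the hand-picked embeddings already rank the true edge above the corrupted one; during training, gradients of this loss would move the encoder toward embeddings with that property.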
