Query-Aware Learnable Graph Pooling Tokens as Prompt for Large Language Models
Wooyoung Kim, Byungyoon Park, Wooju Kim
TL;DR
This paper addresses the challenge of encoding text-attributed graphs for large language models by introducing Learnable Graph Pooling Tokens (LGPT), a set of trainable tokens that balance fine-grained node information with global graph context. It further proposes Early Query Fusion, which integrates query context before graph embedding, yielding more query-tailored representations. Together, LGPT and Early Query Fusion improve Graph QA performance, achieving an average gain of $4.13\%$ on GraphQA without training the LLM and demonstrating robustness under LoRA-based LLM training. The approach offers a scalable alternative to node-level and single-vector graph representations, maintaining low complexity while reducing information loss in graph-to-text prompting. These findings advance practical graph reasoning with LLMs in diverse domains, including scene graphs and knowledge graphs.
Abstract
Graph-structured data plays a vital role in numerous domains, such as social networks, citation networks, commonsense reasoning graphs, and knowledge graphs. While graph neural networks have long been used for graph processing, recent work has explored integrating large language models into graph-based tasks. In this paper, we propose a novel approach named Learnable Graph Pooling Token (LGPT), which addresses two limitations of existing methods: the scalability issues of node-level projection and the information loss of graph-level projection. LGPT enables flexible and efficient graph representation by introducing learnable parameters that act as tokens in large language models, balancing fine-grained and global graph information. Additionally, we investigate an Early Query Fusion technique, which fuses query context before constructing the graph representation, leading to more effective graph embeddings. Our method achieves an average performance improvement of $4.13\%$ on the GraphQA benchmark without training the large language model, demonstrating significant gains in handling complex text-attributed graph data.
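
To make the idea concrete, the following is a minimal sketch of how a fixed number of pooling tokens could summarize a graph of arbitrary size, with the query fused into the node features beforehand. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the pooling or fusion mechanism, so attention-style pooling and additive fusion are assumed here, and the names `fuse_query` and `pool_with_tokens` are hypothetical. The node, query, and token vectors are random stand-ins for what would be encoder outputs and trained parameters.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_query(node_embs, query_emb):
    """Assumed early fusion: add the query vector to every node embedding."""
    return [[x + q for x, q in zip(node, query_emb)] for node in node_embs]


def pool_with_tokens(node_embs, pooling_tokens):
    """Assumed attention pooling: each token yields one weighted average of the nodes."""
    dim = len(pooling_tokens[0])
    pooled = []
    for token in pooling_tokens:
        weights = softmax([dot(token, node) / math.sqrt(dim) for node in node_embs])
        pooled.append([
            sum(w * node[j] for w, node in zip(weights, node_embs))
            for j in range(dim)
        ])
    return pooled


rng = random.Random(0)
dim, num_nodes, num_tokens = 4, 6, 2
node_embs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]
query_emb = [rng.gauss(0, 1) for _ in range(dim)]
# In training these would be learnable parameters; here they are random stand-ins.
pooling_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]

# The prompt length depends on num_tokens, not on the number of nodes.
graph_prompt = pool_with_tokens(fuse_query(node_embs, query_emb), pooling_tokens)
print(len(graph_prompt), len(graph_prompt[0]))  # prints "2 4"
```

In this sketch, setting `num_tokens` to 1 collapses the graph into a single vector (the graph-level extreme), while larger values retain more node-level detail; the output length stays fixed regardless of how many nodes the graph has.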
