Scaling GraphLLM with Bilevel-Optimized Sparse Querying
Yangzhe Peng, Haiquan Qiu, Quanming Yao, Kun He
TL;DR
Bilevel-Optimized Sparse Querying (BOSQ) tackles the scalability bottleneck of GraphLLMs on Text-Attributed Graphs (TAGs) by learning a sparse, task-driven querying policy that limits LLM calls to a fixed budget of $K$ nodes. The method is formulated as a bilevel optimization: the outer level learns node importances $\boldsymbol{\lambda}$ to select informative nodes, while the inner level trains a GNN on the augmented node features. The top-$K$ mask $\mathbf{m}$ is made trainable with a straight-through estimator, and LLM/LM parameters are kept frozen. Hypergradients are computed through the Implicit Function Theorem with a Neumann-series approximation, enabling end-to-end learning with overall complexity dominated by $O(K\cdot C_{\text{LLM}})$. Experiments on six real-world TAGs show that BOSQ delivers orders-of-magnitude speedups over dense GraphLLMs, scales to a million-scale evaluation on ogbn-products with competitive accuracy, and learns node-importance scores that transfer across model sizes. This work makes GraphLLMs practical for large-scale TAGs by using LLM explanations efficiently without sacrificing task performance.
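To make the Neumann-series step concrete: implicit-function-theorem hypergradients generally require an inverse-Hessian-vector product $H^{-1}v$, which can be approximated by the truncated series $\alpha \sum_{j=0}^{J-1} (I - \alpha H)^j v$ when the spectral radius of $I - \alpha H$ is below 1. The sketch below is a generic illustration in plain Python, not BOSQ's implementation; the function names, the toy matrix `H`, the step size `alpha`, and the number of terms are all assumptions chosen for the example.

```python
def matvec(H, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(h_ij * v_j for h_ij, v_j in zip(row, v)) for row in H]


def neumann_inverse_hvp(H, v, alpha, num_terms):
    """Approximate H^{-1} v by alpha * sum_{j=0}^{num_terms-1} (I - alpha*H)^j v.

    Converges when the spectral radius of (I - alpha*H) is below 1, e.g. for a
    symmetric positive-definite H with 0 < alpha < 2 / lambda_max(H).
    """
    term = list(v)   # current series term, starts at (I - alpha*H)^0 v = v
    total = list(v)  # running sum of the series terms
    for _ in range(num_terms - 1):
        h_term = matvec(H, term)
        # Apply (I - alpha*H) to the previous term without forming the matrix.
        term = [t - alpha * ht for t, ht in zip(term, h_term)]
        total = [s + t for s, t in zip(total, term)]
    return [alpha * s for s in total]


# Toy stand-in for an inner-problem Hessian (hypothetical values).
H = [[2.0, 0.5],
     [0.5, 1.0]]
v = [1.0, 1.0]

approx = neumann_inverse_hvp(H, v, alpha=0.4, num_terms=100)
residual = [hv - vi for hv, vi in zip(matvec(H, approx), v)]
print("approximate H^{-1} v:", approx)
print("residual H*approx - v:", residual)
```

Only matrix-vector products with $H$ are needed, so in a real training loop the explicit matrix would typically be replaced by Hessian-vector products from automatic differentiation. Truncating the series earlier trades accuracy for fewer such products.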
Abstract
Large language models (LLMs) have recently shown strong potential in enhancing node-level tasks on text-attributed graphs (TAGs) by providing explanation features. However, their practical use is severely limited by the high computational and monetary cost of repeated LLM queries. To illustrate, naively generating explanations for all nodes of a medium-sized benchmark like Photo (48k nodes) using a representative method (e.g., TAPE) would consume days of processing time. In this paper, we propose Bilevel-Optimized Sparse Querying (BOSQ), a general framework that selectively leverages LLM-derived explanation features to improve node-level task performance on TAGs. We design an adaptive sparse querying strategy that decides when to invoke LLMs, avoiding redundant or low-gain queries and significantly reducing computational overhead. Extensive experiments on six real-world TAG datasets involving two types of node-level tasks demonstrate that BOSQ achieves orders-of-magnitude speedups over existing GraphLLM methods while consistently delivering on-par or superior performance.
