LLM as Graph Kernel: Rethinking Message Passing on Text-Rich Graphs

Ying Zhang, Hang Yu, Haipeng Zhang, Peng Di

Abstract

Text-rich graphs, which integrate complex structural dependencies with abundant textual information, are ubiquitous yet remain challenging for existing learning paradigms. Conventional methods and even LLM-hybrids compress rich text into static embeddings or summaries before structural reasoning, creating an information bottleneck and detaching updates from the raw content. We argue that in text-rich graphs, the text is not merely a node attribute but the primary medium through which structural relationships are manifested. We introduce RAMP, a Raw-text Anchored Message Passing approach that moves beyond using LLMs as mere feature extractors and instead recasts the LLM itself as a graph-native aggregation operator. RAMP exploits the text-rich nature of the graph via a novel dual-representation scheme: it anchors inference on each node's raw text during each iteration while propagating dynamically optimized messages from neighbors. It further handles both discriminative and generative tasks under a single unified generative formulation. Extensive experiments show that RAMP effectively bridges the gap between graph propagation and deep text reasoning, achieving competitive performance and offering new insights into the role of LLMs as graph kernels for general-purpose graph learning.

Paper Structure (43 sections, 3 equations, 5 figures, 8 tables)

Figures (5)

  • Figure 1: Illustration of message passing paradigms on text-rich graphs. Given a text-rich graph (left) where each node is associated with long textual content, (i) traditional GNNs compress node texts into a compact representation during aggregation, whereas (ii) the explicit aggregation paradigm retains the original text for LLM reasoning.
  • Figure 2: Architecture of RAMP. Given a text-rich graph and a query, we wrap each node's content into a token sequence and decode all nodes in parallel to obtain their summaries. The hidden states of the summary tokens are stored in a memory table to initialize the next layer, realizing (a) layer-wise message passing; the final decoder aggregates graph information for (b) answer generation.
  • Figure 3: Scalability on Cora. (a) Accuracy and absolute inference time for RAMP. (b) Inference time scaling for each method, independently normalized to its performance in the smallest bucket (i.e., bucket $<25$).
  • Figure 4: Illustration of the graph transformation process for RAMP in the GraphQA task. (a) An example of two arguments with a "Support" relation from the ExplaGraphs dataset. (b) The original graph structure, where relations are represented as labeled edges. (c) Our transformed graph, where each original edge is reified into a dedicated node (orange), creating a bipartite-like structure that allows RAMP to process textual relational information more effectively.
  • Figure 5: Scalability on PubMed. (a) Accuracy and absolute inference time for RAMP. (b) Inference time scaling for each method, independently normalized to its performance in the smallest bucket (i.e., bucket $<10$).
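
The edge reification described in the Figure 4 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reify_edges`, the `rel_{i}` identifier scheme, and the placeholder argument texts are all assumptions made for illustration. The sketch only shows the general idea of turning each labeled edge into a dedicated node whose text is the relation label, so that relational text can be handled like any other node text.

```python
from typing import Dict, List, Tuple


def reify_edges(
    node_text: Dict[str, str],
    labeled_edges: List[Tuple[str, str, str]],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Turn each labeled edge (src, label, dst) into a dedicated relation node.

    Returns the extended node-text table and a list of unlabeled edges.
    Hypothetical helper for illustration only.
    """
    new_text = dict(node_text)  # copy so the caller's table is not mutated
    new_edges: List[Tuple[str, str]] = []
    for i, (src, label, dst) in enumerate(labeled_edges):
        rel_id = f"rel_{i}"  # assumed naming scheme for relation nodes
        new_text[rel_id] = label  # the relation label becomes node text
        new_edges.append((src, rel_id))  # src -> relation node
        new_edges.append((rel_id, dst))  # relation node -> dst
    return new_text, new_edges


# Placeholder texts; the real ExplaGraphs arguments are not reproduced here.
nodes = {"a1": "first argument text", "a2": "second argument text"}
edges = [("a1", "Support", "a2")]
texts, unlabeled = reify_edges(nodes, edges)
```

Every original node now connects only to relation nodes and vice versa, which matches the bipartite-like structure mentioned in the caption. Whether the transformed graph keeps edge direction is a choice made for this sketch; the caption does not specify it.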