Are Large Language Models In-Context Graph Learners?

Jintang Li, Ruofan Wu, Yuchang Zhu, Huizhe Zhang, Liang Chen, Zibin Zheng

TL;DR

This work addresses the gap between large language models and graph-structured data by reframing graph learning as retrieval-augmented generation (RAG). It establishes a formal connection between message passing in graph neural networks and RAG, and introduces three graph-guided RAG frameworks (QueryRAG, LabelRAG, and FewshotRAG) that use local graph neighborhoods to supply in-context signals to LLMs without fine-tuning. In experiments on text-attributed graphs, the authors show that these frameworks substantially improve in-context performance, with LabelRAG and FewshotRAG sometimes matching or surpassing supervised MLPs and approaching the accuracy of GNNs. The results point to a viable path for applying off-the-shelf LLMs to graph tasks by combining structured graph context with retrieval-guided reasoning, which broadens the applicability of LLMs to knowledge-rich graph domains.

Abstract

Large language models (LLMs) have demonstrated remarkable in-context reasoning capabilities across a wide range of tasks, particularly with unstructured inputs such as language or images. However, LLMs struggle to handle structured data, such as graphs, due to their lack of understanding of non-Euclidean structures. As a result, without additional fine-tuning, their performance significantly lags behind that of graph neural networks (GNNs) in graph learning tasks. In this paper, we show that learning on graph data can be conceptualized as a retrieval-augmented generation (RAG) process, where specific instances (e.g., nodes or edges) act as queries, and the graph itself serves as the retrieved context. Building on this insight, we propose a series of RAG frameworks to enhance the in-context learning capabilities of LLMs for graph learning tasks. Comprehensive evaluations demonstrate that our proposed RAG frameworks significantly improve LLM performance on graph-based tasks, particularly in scenarios where a pretrained LLM must be used without modification or accessed via an API.
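
To make the "node as query, graph as retrieved context" idea concrete, the following is a minimal sketch of how the three variants named in this summary might assemble a prompt. Everything in it is an assumption for illustration: the toy graph, the 1-hop retrieval rule, the prompt wording, and the names `retrieve` and `build_prompt` are hypothetical and are not taken from the paper. It only mirrors the high-level distinction that QueryRAG passes neighbor texts, LabelRAG passes neighbor labels, and FewshotRAG passes both.

```python
from typing import Dict, List, Optional

# Toy text-attributed graph; all contents are hypothetical illustration data.
TEXTS: Dict[int, str] = {
    0: "Paper on convolutional networks for image recognition.",
    1: "Paper on graph neural networks for citation data.",
    2: "Paper on message passing in molecular graphs.",
}
# Known labels for some nodes; node 0 is the unlabeled target.
LABELS: Dict[int, Optional[str]] = {0: None, 1: "Graph Learning", 2: "Graph Learning"}
# Undirected adjacency list.
NEIGHBORS: Dict[int, List[int]] = {0: [1, 2], 1: [0], 2: [0]}

MODES = ("query", "label", "fewshot")


def retrieve(node: int) -> List[int]:
    """Retrieval step (assumed): the 1-hop neighborhood is the retrieved context."""
    return NEIGHBORS[node]


def build_prompt(node: int, mode: str) -> str:
    """Assemble a prompt for one variant; the template wording is an assumption."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    lines: List[str] = []
    for n in retrieve(node):
        label = LABELS.get(n)
        if mode == "query":
            # QueryRAG-style: neighbor texts only.
            lines.append(f"Neighbor text: {TEXTS[n]}")
        elif mode == "label" and label is not None:
            # LabelRAG-style: neighbor labels only.
            lines.append(f"Neighbor label: {label}")
        elif mode == "fewshot" and label is not None:
            # FewshotRAG-style: (text, label) pairs as few-shot examples.
            lines.append(f"Text: {TEXTS[n]}\nLabel: {label}")
    context = "\n".join(lines)
    return f"Context:\n{context}\n\nClassify this text: {TEXTS[node]}\nLabel:"
```

Calling `build_prompt(0, "fewshot")` yields a string that could be sent unchanged to any instruction-tuned LLM, which matches the setting described above where the model is frozen or reachable only through an API.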

Paper Structure

This paper contains 45 sections, 8 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: A technical comparison between GNN and RAG. Both leverage contextual information beyond the raw input, through homogeneous and heterogeneous structures, respectively.
  • Figure 2: Illustration of the proposed QueryRAG, LabelRAG, and FewshotRAG frameworks. Specifically, QueryRAG uses the retrieved queries as references, LabelRAG incorporates only the corresponding labels, and FewshotRAG combines both query and label information as few-shot context.
  • Figure 3: In-context node classification results on Cora and Pubmed datasets. GNN and MLP are trained with supervision on a subset of the nodes.
  • Figure 4: In-context node classification results on Cora and Pubmed datasets with our proposed QueryRAG, LabelRAG, and FewshotRAG.
  • Figure 5: In-context performance of QueryRAG, LabelRAG, and FewshotRAG compared to the zero-shot performance of Llama-3.1-8B-Instruct and DeepSeek-V3.
  • ...and 3 more figures