Graph2text or Graph2token: A Perspective of Large Language Models for Graph Learning

Shuo Yu, Yingbo Wang, Ruolin Li, Guchun Liu, Yanming Shen, Shaoxiong Ji, Bowen Li, Fengling Han, Xiuzhen Zhang, Feng Xia

TL;DR

This paper surveys how large language models (LLMs) can be applied to graph learning by transforming graphs into text (Graph2text) or into tokens (Graph2token). It introduces a problem-centered taxonomy, identifies four core transformation challenges (alignment, position, multi-level semantics, and context), and reviews methods for Abstract Meaning Representation (AMR) graphs, knowledge graphs, and general graphs. The work provides practical guidelines on encoder choices, prompts, and fine-tuning, and highlights open problems including efficiency, dynamic graphs, and fairness as directions for future research. By leveraging LLMs' text-processing ability and world knowledge, LLM4graph aims to improve semantic understanding and generalization in graph-driven tasks across domains, though scalability and dynamic settings remain notable challenges.

Abstract

Graphs are data structures used to represent irregular networks and are prevalent in numerous real-world applications. Previous methods model graph structures directly and have achieved significant success. However, these methods encounter bottlenecks due to the inherent irregularity of graphs. An innovative solution is to convert graphs into textual representations, thereby harnessing the capabilities of Large Language Models (LLMs) to process and comprehend graphs. In this paper, we present a comprehensive review of methodologies for applying LLMs to graphs, termed LLM4graph. The core of LLM4graph lies in transforming graphs into texts for LLMs to understand and analyze. We therefore propose a novel taxonomy of LLM4graph methods from the perspective of this transformation. Specifically, existing methods can be divided into two paradigms, Graph2text and Graph2token, which transform graphs into texts or tokens, respectively, as the input of LLMs. We identify four challenges that arise during the transformation and use them to present existing methods systematically from a problem-oriented perspective. For practical purposes, we provide a guideline for researchers on selecting appropriate models and LLMs for different graphs and hardware constraints. We also identify five future research directions for LLM4graph.

Paper Structure

This paper contains 53 sections, 19 equations, 7 figures, and 1 table.

Figures (7)

  • Figure 1: The illustration of LLM4graph. LLMs understand and process various types of graph data through Graph2text or Graph2token.
  • Figure 2: A taxonomy of research on large language models for graph learning.
  • Figure 3: Challenges in transforming graphs to texts. (a) The alignment problem. (b) The position problem. (c) The multi-level semantics problem. (d) The context problem.
  • Figure 4: Direct textualization with natural language generation and GDL generation.
  • Figure 5: Technical approach of the indirect textualization method.
  • ...and 2 more figures