Can Graph Learning Improve Planning in LLM-based Agents?

Xixi Wu, Yifei Shen, Caihua Shan, Kaitao Song, Siwei Wang, Bohang Zhang, Jiarui Feng, Hong Cheng, Wei Chen, Yun Xiong, Dongsheng Li

TL;DR

The paper tackles task planning for LLM-based agents by casting sub-tasks as a graph and showing that integrating GNNs with LLMs yields robust improvements, especially as task graphs grow larger. It provides both theoretical insights into the expressiveness and biases of Transformers on graph inputs and empirical evidence that GNNs can greatly reduce hallucinations and improve plan quality. The authors present training-free and training-based GNN approaches, demonstrate strong gains across diverse datasets and LLMs, and show that the benefits scale with graph size. They also demonstrate that the gains are orthogonal to prompts and LM fine-tuning, suggesting a practical, efficient path to more reliable graph-based planning in real systems.

Abstract

Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs). It aims to break down complex user requests in natural language into solvable sub-tasks, thereby fulfilling the original requests. In this context, the sub-tasks can be naturally viewed as a graph, where the nodes represent the sub-tasks, and the edges denote the dependencies among them. Consequently, task planning is a decision-making problem that involves selecting a connected path or subgraph within the corresponding graph and invoking it. In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design. Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs, which is adeptly addressed by graph neural networks (GNNs). This theoretical insight led us to integrate GNNs with LLMs to enhance overall performance. Extensive experiments demonstrate that GNN-based methods surpass existing solutions even without training, and minimal training can further enhance their performance. The performance gain increases with a larger task graph size.
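
The graph view described in the abstract can be made concrete with a minimal sketch. The sub-task names, the `TASK_GRAPH` dictionary, and the helpers `check_plan` and `is_valid_path` below are all hypothetical and chosen only for illustration; they are not taken from the paper. Under those assumptions, a plan counts as valid when every sub-task is a node of the graph and every consecutive pair follows a dependency edge. Anything else can be read as a hallucinated node or edge.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical task graph (illustrative names, not from the paper).
# Nodes are sub-tasks; an edge u -> v means the output of u can feed v.
TASK_GRAPH: Dict[str, Set[str]] = {
    "image_captioning": {"translation", "text_to_speech"},
    "translation": {"text_to_speech"},
    "text_to_speech": set(),
}


def check_plan(
    graph: Dict[str, Set[str]], plan: List[str]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (unknown sub-tasks, consecutive pairs with no edge) for a plan."""
    bad_nodes = [task for task in plan if task not in graph]
    bad_edges = [
        (u, v)
        for u, v in zip(plan, plan[1:])
        # Only report an edge when both endpoints exist in the graph.
        if u in graph and v in graph and v not in graph[u]
    ]
    return bad_nodes, bad_edges


def is_valid_path(graph: Dict[str, Set[str]], plan: List[str]) -> bool:
    """A plan is valid if it is non-empty and follows edges of the graph."""
    bad_nodes, bad_edges = check_plan(graph, plan)
    return bool(plan) and not bad_nodes and not bad_edges
```

A plan proposed by an LLM could be passed through `check_plan` before invocation, so that unknown sub-tasks and unsupported dependencies are caught before any sub-task is executed.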

Paper Structure

This paper contains 53 sections, 7 theorems, 6 equations, 11 figures, and 14 tables.

Key Result

Theorem 1

(LLMs have enough expressiveness) Assume the input format is the one given in eq:edge_list, and that $f$, $g$, and $\square$ in the DP update eq:dp satisfy assumption:fgh and assumption:F in the Appendix. Then there exists a log-precision, constant-depth, constant-width Transformer that simulates one step of the DP update.
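
The DP update and the edge-list format that the theorem refers to are not reproduced on this page, so the following is only a rough sketch of what "one step of a DP update over an edge list" can look like. It assumes a generic form in which each node combines messages from its incoming edges: `g` builds a message from a neighbour's value and the edge weight, `agg` plays the role of the aggregation $\square$, and `f` merges the result with the node's old value. The function `dp_step`, its parameter names, and the shortest-path instantiation (`g` = addition, `agg` = `min`, `f` = `min`) are assumptions made for illustration, not the paper's definitions.

```python
import math
from typing import Callable, Dict, List, Tuple

Edge = Tuple[str, str, float]  # (source, destination, weight)


def dp_step(
    values: Dict[str, float],
    edges: List[Edge],
    g: Callable[[float, float], float],
    agg: Callable[[List[float]], float],
    f: Callable[[float, float], float],
) -> Dict[str, float]:
    """Apply one synchronous DP update to every node, reading an edge list."""
    incoming: Dict[str, List[float]] = {}
    for src, dst, weight in edges:
        # Message from src to dst, computed from the previous values only.
        incoming.setdefault(dst, []).append(g(values[src], weight))
    return {
        node: f(old, agg(incoming[node])) if node in incoming else old
        for node, old in values.items()
    }


# Assumed instantiation: shortest paths from "a" (Bellman-Ford style relaxation).
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0)]
values = {"a": 0.0, "b": math.inf, "c": math.inf}
for _ in range(len(values) - 1):
    values = dp_step(values, edges, g=lambda d, w: d + w, agg=min, f=min)

print(values)  # {'a': 0.0, 'b': 1.0, 'c': 3.0}
```

Each call to `dp_step` corresponds to a single update; the theorem's claim concerns simulating one such step, while the loop here simply repeats it until the values settle on this small example.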

Figures (11)

  • Figure 1: Illustration of Task Planning in Language Agents (e.g., HuggingGPT [shen2024hugginggpt])
  • Figure 2: Illustration of (a) LLMs' planning performance and hallucination in HuggingGPT, and (b) hallucination in relation to task graph size.
  • Figure 3: Orthogonal Effectiveness to both Improved Prompts and Fine-tuned LLMs
  • Figure 4: Illustrative details of experimental datasets.
  • Figure 5: Illustrative Examples of LLMs' Failure to Solve Graph Computational Problems under Permutation (i.e., node reordering). Each experiment was repeated 30 times.
  • ...and 6 more figures

Theorems & Definitions (11)

  • Theorem 1
  • Proposition 1
  • Theorem 2
  • Example 1
  • Theorem 3
  • proof
  • Lemma 1
  • Proposition 2
  • proof
  • Theorem 4
  • ...and 1 more