A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models

Chengxing Xie; Difan Zou

TL;DR

The paper tackles multi-phase planning for LLM agents by proposing a human-inspired framework that decomposes a task into three phases: Outline Generation, Information Collection (supported by Strategy and Knowledge Blocks), and Plan Making (guided by a plan-search loop). It uses multi-agent collaboration to generate outlines, collect data, and iteratively refine daily plans, achieving substantial improvements on the TravelPlanner benchmark, including notable gains when paired with GPT-4-Turbo. The key contributions are the Strategy and Knowledge Blocks, which manage information flow, and a plan-search mechanism that reduces hallucinations and constraint violations, both backed by ablation studies. This work advances the practical planning capabilities of LLM agents in complex, real-world tasks and outlines avenues for broader tool integration and responsible deployment.

Abstract

Recent studies have highlighted the proficiency of large language model (LLM) agents in some simple tasks, such as writing and coding, through various reasoning strategies. However, LLM agents still struggle with tasks that require comprehensive planning, which remains a critical research issue. In this study, we concentrate on travel planning, a Multi-Phases planning problem that involves multiple interconnected stages, such as outlining, information gathering, and planning, and that often requires managing various constraints and uncertainties. Existing reasoning approaches have struggled to address this complex task effectively. Our research aims to address this challenge by developing a human-like planning framework for LLM agents, i.e., guiding the LLM agent to simulate the steps that humans take when solving Multi-Phases problems. Specifically, we implement several strategies that enable LLM agents to generate a coherent outline for each travel query, mirroring human planning patterns. Additionally, we integrate a Strategy Block and a Knowledge Block into our framework: the Strategy Block facilitates information collection, while the Knowledge Block provides essential information for detailed planning. Through extensive experiments, we demonstrate that our framework significantly improves the planning capabilities of LLM agents, enabling them to tackle the travel planning task with improved efficiency and effectiveness. When combined with GPT-4-Turbo, the proposed framework achieves a $10\times$ performance gain over the baseline framework deployed on GPT-4-Turbo.
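The three phases described above can be pictured as a small pipeline. The following is a minimal sketch only, not the authors' implementation: every function name, the retry-with-feedback form of the plan-search loop, and the `max_attempts` parameter are assumptions made for illustration, and the LLM agents are replaced by plain callables.

```python
from typing import Any, Callable, Dict, List, Optional


def plan_trip(
    query: Dict[str, Any],
    generate_outline: Callable,      # hypothetical stand-in for the Outline Generation agents
    collect_information: Callable,   # hypothetical stand-in for Information Collection
    propose_day_plan: Callable,      # hypothetical stand-in for the Plan Making agent
    check_constraints: Callable,     # hypothetical checker returning a list of violations
    max_attempts: int = 3,           # assumed retry budget per day
) -> List[Dict[str, Any]]:
    # Phase 1: rough outline of the trip.
    outline = generate_outline(query)
    # Phase 2: gather the data needed for detailed planning.
    knowledge = collect_information(query, outline)
    # Phase 3: build the plan day by day, retrying a day when it violates constraints.
    plan: List[Dict[str, Any]] = []
    for day in range(1, query["days"] + 1):
        feedback: Optional[str] = None
        candidate: Optional[Dict[str, Any]] = None
        for _ in range(max_attempts):
            candidate = propose_day_plan(query, outline, knowledge, plan, day, feedback)
            violations = check_constraints(candidate, query)
            if not violations:
                break
            feedback = "; ".join(violations)
        # Assumption: if no attempt passes, the last candidate is kept.
        plan.append(candidate)
    return plan


def demo() -> List[Dict[str, Any]]:
    """Run the pipeline with toy stand-ins for the agents."""
    def outline(q):
        return ["Day 1: A -> B", "Day 2: B -> A"]

    def collect(q, o):
        return {"hotels": ["H1"]}

    def propose(q, o, k, plan, day, feedback):
        # The first attempt is over budget; the retry with feedback is cheaper.
        return {"day": day, "cost": 500 if feedback is None else 100}

    def check(candidate, q):
        over = candidate["cost"] > q["daily_budget"]
        return ["cost exceeds daily budget"] if over else []

    return plan_trip({"days": 2, "daily_budget": 200}, outline, collect, propose, check)
```

In the toy run, each day's first proposal is rejected by the checker and the second is accepted, which shows how constraint feedback can steer the search without any change to the earlier phases.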

Paper Structure (30 sections, 14 figures, 2 tables)

Introduction
Related Works
Reasoning Strategy for LLM Agents
Multi-Agents Framework
Method
Task Description:
Framework Overview
Outline Generation
Information Collection
Plan Making
Experiment
Experiment Setup
Experiment Result
Ablation Study
Conclusion and Future Work
...and 15 more sections

Figures (14)

Figure 1: Our human-like planning framework consists of three key parts. In the Outline Generation phase, LLM agents produce rough plans and identify key information related to the query, establishing the foundation for detailed future planning. In the Information Collection phase, LLM agents gather the essential data required for comprehensive planning. Finally, in the Plan Making phase, LLM agents explore the potential plan space and return a well-structured, reasonable plan.
Figure 2: The transportation evaluation process for each route generated by the PathFinder Agent. Each route is checked for whether it is reasonable under the limitations of the specific travel planning query, e.g., flights are not allowed. If the route is not reasonable, it is regenerated with feedback. Otherwise, if the route is not reasonable for some kinds of transportation, the corresponding constraint is added.
Figure 3: The Knowledge Block workflow. The left part shows the Knowledge Write process, in which a function's result is written to the top of the Knowledge Block. In the Knowledge Read process, when knowledge needs to be popped out but the number of items is below the threshold, the Knowledge Block pops out enough items to meet the threshold. If the number exceeds the threshold, the items from the past two days are popped out.
Figure 4: The error distribution of GPT-4-Turbo's Delivery Rate failures.
Figure 5: The error distribution of GPT-3.5-Turbo and GPT-4-Turbo.
...and 9 more figures
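The Knowledge Block behaviour described in the Figure 3 caption can be sketched as a small data structure. This is one possible reading of the caption, not the authors' code: the class and field names, the per-item `day` tag, the default threshold, the definition of "past two days" as the current and previous day, and the choice to return items without removing them are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class KnowledgeItem:
    day: int       # trip day the item relates to (hypothetical tag)
    source: str    # name of the function whose result this is
    content: Any


@dataclass
class KnowledgeBlock:
    threshold: int = 3  # hypothetical default
    items: List[KnowledgeItem] = field(default_factory=list)  # index 0 is the top

    def write(self, day: int, source: str, content: Any) -> None:
        """Knowledge Write: place a function's result at the top of the block."""
        self.items.insert(0, KnowledgeItem(day, source, content))

    def read(self, current_day: int) -> List[KnowledgeItem]:
        """Knowledge Read under the assumptions stated in the lead-in."""
        # Assumed meaning of "past two days": the current day and the day before.
        recent = [it for it in self.items if 0 <= current_day - it.day < 2]
        if len(recent) > self.threshold:
            # More recent items than the threshold: return the recent ones.
            return recent
        # Otherwise return items from the top until the threshold is met
        # (or the block is exhausted).
        return self.items[: self.threshold]
```

For example, after writing two day-1 results and reading on day 3, no items are recent, so the read falls back to the top of the block; after four further day-3 writes, the same read returns only the day-3 items.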
