Large Language Models for Planning: A Comprehensive and Systematic Survey
Pengfei Cao, Tianyi Men, Wencan Liu, Jingwen Zhang, Xuzhao Li, Xixun Lin, Dianbo Sui, Yanan Cao, Kang Liu, Jun Zhao
TL;DR
This survey provides a structured, up-to-date overview of planning with large language models. It frames planning as a sequence of states and actions within an MDP/POMDP formulation and classifies approaches into three paradigms: External Module Augmented, Finetuning-based, and Searching-based. It covers methods ranging from PDDL-based translator pipelines and memory-augmented agents to fine-tuning driven by imitation and feedback and a variety of search and decomposition strategies, together with evaluation frameworks spanning digital and embodied domains. The work analyzes the mechanisms behind LLM planning, including chain-of-thought and look-ahead planning, and discusses interpretability, limitations, and future directions such as RL integration, environmental constraints, personalization, edge deployment, and improved generalization. The authors also release a large resource collection to support ongoing research and benchmarking in this rapidly evolving field.
Abstract
Planning is a fundamental capability of intelligent agents, requiring comprehensive environmental understanding, rigorous logical reasoning, and effective sequential decision-making. While Large Language Models (LLMs) have demonstrated remarkable performance on certain planning tasks, their broader application in this domain warrants systematic investigation. This paper presents a comprehensive review of LLM-based planning. The survey is structured as follows. First, we establish the theoretical foundations by introducing the essential definitions and categories of automated planning. Next, we provide a detailed taxonomy and analysis of contemporary LLM-based planning methodologies, categorizing them into three principal approaches: 1) External Module Augmented Methods, which combine LLMs with additional components for planning; 2) Finetuning-based Methods, which use trajectory data and feedback signals to adjust LLMs and improve their planning abilities; and 3) Searching-based Methods, which break down complex tasks into simpler components, navigate the planning space, or enhance decoding strategies to find the best solutions. Subsequently, we systematically summarize existing evaluation frameworks, including benchmark datasets, evaluation metrics, and performance comparisons between representative planning methods. Finally, we discuss the underlying mechanisms enabling LLM-based planning and outline promising research directions for this rapidly evolving field. We hope this survey will serve as a valuable resource that inspires innovation and drives further progress.
