
Thought of Search: Planning with Language Models Through The Lens of Efficiency

Michael Katz, Harsha Kokel, Kavitha Srinivas, Shirin Sohrabi

TL;DR

The paper scrutinizes how large language models have been used for planning, showing that many LM-based approaches sacrifice soundness, completeness, or efficiency. It introduces Thought of Search (ToS), a method that uses LLMs to generate Python implementations of essential search components (successor function and goal test) which are then used by classical search algorithms to guarantee sound, complete (and sometimes optimal) solutions with far fewer LM calls. Through experiments on four problems—24 Game, mini crosswords, BlocksWorld, and PrOntoQA—the authors report 100% solution accuracy and substantial resource savings compared to LM-only planning methods. The work advocates responsible compute use and highlights a path toward robust, efficient LM-assisted planning by combining symbolic components with targeted LM-generated code. The overall contribution is a principled framework for leveraging LLMs to supply symbolic search tools while preserving core algorithmic guarantees and improving practicality.

Abstract

Among the most important properties of algorithms investigated in computer science are soundness, completeness, and complexity. These properties, however, are rarely analyzed for the vast collection of recently proposed methods for planning with large language models. In this work, we alleviate this gap. We analyse these properties of using LLMs for planning and highlight that recent trends abandon both soundness and completeness for the sake of inefficiency. We propose a significantly more efficient approach that can, at the same time, maintain both soundness and completeness. We exemplify on four representative search problems, comparing to the LLM-based solutions from the literature that attempt to solve these problems. We show that by using LLMs to produce the code for the search components we can solve the entire datasets with 100% accuracy with only a few calls to the LLM. We argue for a responsible use of compute resources, urging the research community to investigate sound and complete LLM-based approaches that uphold efficiency.
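To make the idea of "search components" concrete, here is a minimal sketch of what a successor function and a goal test might look like for the 24 Game, plugged into a plain breadth-first search. This is not the code from the paper: the names `successors`, `is_goal`, and `bfs`, the state encoding (a sorted tuple of `Fraction` values), and the choice of BFS are all assumptions made for illustration. In the approach described above, the two component functions would be produced by the LLM, and the search loop would make no LLM calls at all.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def successors(state):
    """Yield (action, next_state) pairs for a 24 Game state.

    A state is a sorted tuple of Fractions (an assumed encoding). Each move
    picks two numbers and replaces them with the result of one arithmetic
    operation.
    """
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [state[k] for k in range(len(state)) if k not in (i, j)]
        results = [
            (f"({a}) + ({b})", a + b),
            (f"({a}) * ({b})", a * b),
            (f"({a}) - ({b})", a - b),
            (f"({b}) - ({a})", b - a),
        ]
        # Division is only a legal move when the divisor is non-zero.
        if b != 0:
            results.append((f"({a}) / ({b})", a / b))
        if a != 0:
            results.append((f"({b}) / ({a})", b / a))
        for action, value in results:
            yield action, tuple(sorted(rest + [value]))


def is_goal(state):
    """Goal test: exactly one number remains and it equals 24."""
    return len(state) == 1 and state[0] == 24


def bfs(initial):
    """Breadth-first search using only the two components above.

    Returns a list of actions leading to a goal state, or None if the
    reachable state space contains no goal.
    """
    start = tuple(sorted(Fraction(n) for n in initial))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None


if __name__ == "__main__":
    print(bfs([4, 9, 10, 13]))  # a three-step plan, one action per step
    print(bfs([1, 1, 1, 1]))    # no plan exists for this instance
```

In this sketch, soundness follows from the successor function generating only legal moves and the goal test accepting only true solutions, provided both are correct. Completeness follows from BFS exhausting a finite state space. Exact `Fraction` arithmetic is used so that instances requiring intermediate non-integer values, such as 1, 5, 5, 5, are not lost to floating-point rounding.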
