
Complex LLM Planning via Automated Heuristics Discovery

Hongyi Ling, Shubham Parashar, Sambhav Khurana, Blake Olson, Anwesha Basu, Gaurangi Sinha, Zhengzhong Tu, James Caverlee, Shuiwang Ji

TL;DR

The paper addresses the challenge of enabling LLMs to perform complex planning without additional training. It introduces Automated Heuristics Discovery (AutoHD), in which LLMs generate explicit Python heuristic functions that guide inference-time search, and an evolutionary loop refines those functions. By integrating these heuristics into search algorithms such as greedy best-first search (Greedy BFS) and A*, AutoHD achieves significant accuracy gains on Blocksworld, the Game of 24, and Rubik's Cube across multiple LLMs, and provides interpretable insights into the reasoning process. The approach reduces reliance on self-verification and external verifiers and demonstrates strong generalization and robustness across tasks, establishing AutoHD as a practical, interpretable framework for complex planning.

Abstract

We consider enhancing large language models (LLMs) for complex planning tasks. While existing methods allow LLMs to explore intermediate steps to make plans, they depend either on unreliable self-verification or on external verifiers to evaluate these steps, and external verifiers demand significant data and computation. Here, we propose automated heuristics discovery (AutoHD), a novel approach that enables LLMs to explicitly generate heuristic functions to guide inference-time search, allowing accurate evaluation of intermediate states. These heuristic functions are further refined through a heuristic evolution process, improving their robustness and effectiveness. Our proposed method requires no additional model training or fine-tuning, and the explicit definition of heuristic functions generated by the LLMs provides interpretability and insights into the reasoning process. Extensive experiments across diverse benchmarks demonstrate significant gains over multiple baselines, including nearly twice the accuracy on some datasets, establishing our approach as a reliable and interpretable solution for complex planning tasks.
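
To make "heuristic functions that guide inference-time search" concrete, the following is a minimal sketch of greedy best-first search with a pluggable heuristic. It is not the paper's implementation: the function name greedy_best_first, its parameters, and the toy number puzzle in the usage note are all assumptions chosen for illustration. The only idea taken from the abstract is that a heuristic scores intermediate states and the search expands the most promising one first.

```python
import heapq
from itertools import count
from typing import Callable, Hashable, Iterable, List, Optional


def greedy_best_first(
    start: Hashable,
    is_goal: Callable[[Hashable], bool],
    successors: Callable[[Hashable], Iterable[Hashable]],
    heuristic: Callable[[Hashable], float],
    max_expansions: int = 10_000,
) -> Optional[List[Hashable]]:
    """Return a path from start to a goal state, or None if the budget runs out.

    Hypothetical helper: states are expanded in order of heuristic value only
    (lower is better), which is what distinguishes greedy best-first from A*.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(heuristic(start), next(tie), start)]
    parent = {start: None}  # also serves as the visited set
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            # Walk parent links back to the start to rebuild the path.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        expansions += 1
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt))
    return None


# Toy usage (an assumption, unrelated to the paper's benchmarks):
# reach 24 from 1 using the moves "+1" and "*2", capped at 48.
def toy_successors(n: int) -> List[int]:
    return [m for m in (n + 1, n * 2) if m <= 48]


path = greedy_best_first(
    start=1,
    is_goal=lambda n: n == 24,
    successors=toy_successors,
    heuristic=lambda n: abs(24 - n),
)
```

In this sketch the heuristic is an ordinary Python callable, so an LLM-generated function could be dropped in for the hand-written lambda without changing the search code.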

Paper Structure

This paper contains 21 sections, 1 equation, 5 figures, 10 tables, 3 algorithms.

Figures (5)

  • Figure 1: Comparison between existing methods and the proposed method. (a) CoT follows a single linear path, thereby constraining its exploratory capacity. (b) CoT-SC extends this approach by performing multiple CoT iterations, leading to a result with higher confidence scores. (c) ToT introduces a tree-based search mechanism, branching systematically through intermediate states to explore a broader solution space. (d) In contrast, our AutoHD uses a heuristic function generated by the LLM to guide exploration. The heuristic prioritizes promising states, enabling more efficient and effective navigation of the solution space.
  • Figure 2: An example heuristic function proposed by LLMs for Blocksworld. The heuristic function computes the number of misplaced blocks and the cumulative positional differences of these blocks. The resulting sum provides an estimation of the discrepancy between states.
  • Figure 3: Heuristic discovery process of the proposed method. The LLM is prompted to generate a diverse set of initial heuristic functions. These functions are evaluated on validation sets through heuristic-guided search to assess their quality. The top-performing heuristic functions are filtered and evolved to create the next generation by exploring new heuristic functions and refining existing ones. After K evolutions, the best heuristic function across all generations is selected for testing.
  • Figure 4: Ablation study of heuristic evolution on the Rubik's Cube dataset. The evolutionary process improves validation and test accuracies significantly in early generations before plateauing, indicating the discovery of a robust heuristic function.
  • Figure 5: An example from the Rubik's Cube dataset. The task is to transform a cube from a scrambled initial state to a goal state.
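
The Figure 2 caption describes a Blocksworld heuristic that sums the number of misplaced blocks and the cumulative positional differences of those blocks. The sketch below is one possible reading of that caption, not the function shown in the paper. The state encoding (a tuple of stacks, each listed bottom to top), the definition of "misplaced" (a block resting on something other than its goal support), and the use of stack height as "position" are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Assumed encoding: each stack is a tuple of block names, bottom to top.
Stacks = Tuple[Tuple[str, ...], ...]


def block_positions(stacks: Stacks) -> Dict[str, Tuple[Optional[str], int]]:
    """Map each block to (block directly beneath it, or None for the table; height)."""
    positions: Dict[str, Tuple[Optional[str], int]] = {}
    for stack in stacks:
        below: Optional[str] = None
        for height, block in enumerate(stack):
            positions[block] = (below, height)
            below = block
    return positions


def blocksworld_heuristic(state: Stacks, goal: Stacks) -> int:
    """Estimate the distance to the goal: misplaced blocks plus their height gaps.

    Assumes state and goal contain the same set of blocks.
    """
    current = block_positions(state)
    target = block_positions(goal)
    misplaced = 0
    height_gap = 0
    for block, (goal_below, goal_height) in target.items():
        below, height = current[block]
        if below != goal_below:  # resting on the wrong support
            misplaced += 1
            height_gap += abs(height - goal_height)
    return misplaced + height_gap


# Example: B sits on A and C is alone; the goal is the single stack C, B, A.
state = (("A", "B"), ("C",))
goal = (("C", "B", "A"),)
score = blocksworld_heuristic(state, goal)       # 4: two misplaced blocks, height gap 2
at_goal = blocksworld_heuristic(goal, goal)      # 0: nothing misplaced
```

A function of this shape returns 0 exactly when every block rests on its goal support, so a search procedure can use it to rank intermediate states by estimated closeness to the goal.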