Successor-Generator Planning with LLM-generated Heuristics

Alexander Tuisov, Yonatan Vernik, Alexander Shleyfman

TL;DR

This work introduces a framework where LLMs synthesize problem-specific heuristic functions directly from Explicit Successor Generator (ESG) definitions implemented in Rust, enabling GBFS-based planning without repeated model calls. By translating traditional problem descriptions into ESG components, the approach generates a tailored heuristic that guides search effectively, even for domains with complex numeric constraints or nonstandard transitions. Empirical results across numeric IPC benchmarks and expressive domains demonstrate state-of-the-art performance in many cases, while also highlighting trade-offs between model cost, reasoning effort, and instance-specific information. The method broadens planning expressiveness beyond PDDL and offers a practical, verifiable pipeline for automatic heuristic generation and planning, with future work on robustness and hybrid strategies.

Abstract

Heuristics are a central component of deterministic planning, particularly in domain-independent settings where general applicability is prioritized over task-specific tuning. This work revisits that paradigm in light of recent advances in large language models (LLMs), which enable the automatic synthesis of heuristics directly from problem definitions, bypassing the need for handcrafted domain knowledge. We present a method that employs LLMs to generate problem-specific heuristic functions from planning tasks specified through successor generators, goal tests, and initial states written in a general-purpose programming language. These heuristics are compiled and integrated into standard heuristic search algorithms, such as greedy best-first search. Our approach achieves competitive, and in many cases state-of-the-art, performance across a broad range of established planning benchmarks. Moreover, it enables the solution of problems that are difficult to express in traditional formalisms, including those with complex numeric constraints or custom transition dynamics. We provide an extensive empirical evaluation that characterizes the strengths and limitations of the approach across diverse planning settings.
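
To make the setup concrete, the following is a minimal sketch of how a task given as an initial state, a goal test, and a successor generator might be searched with greedy best-first search. It is not the authors' code: the toy counter domain, the constant `TARGET`, and the functions `initial_state`, `is_goal`, `successors`, `heuristic`, `gbfs`, and `execute` are all hypothetical names chosen for illustration, and the hand-written `heuristic` merely stands in for a function that an LLM would generate.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

// Hypothetical toy task (an assumption, not from the paper):
// reach TARGET from 1 using the actions "inc" (+1) and "double" (*2).
type State = u64;
const TARGET: State = 37;

fn initial_state() -> State {
    1
}

fn is_goal(s: &State) -> bool {
    *s == TARGET
}

// Successor generator: returns (action name, next state) pairs.
fn successors(s: &State) -> Vec<(&'static str, State)> {
    let mut out = Vec::new();
    if *s + 1 <= TARGET {
        out.push(("inc", *s + 1));
    }
    if *s * 2 <= TARGET {
        out.push(("double", *s * 2));
    }
    out
}

// Stand-in for a generated heuristic: remaining distance to the target.
fn heuristic(s: &State) -> u64 {
    TARGET.saturating_sub(*s)
}

// Greedy best-first search: always expand the open state with the lowest h.
fn gbfs() -> Option<Vec<&'static str>> {
    let start = initial_state();
    let mut open = BinaryHeap::new();
    // Maps each reached state to the (parent, action) that first reached it.
    let mut parent: HashMap<State, Option<(State, &'static str)>> = HashMap::new();
    open.push(Reverse((heuristic(&start), start)));
    parent.insert(start, None);

    while let Some(Reverse((_, s))) = open.pop() {
        if is_goal(&s) {
            // Walk the parent links back to the start to recover the plan.
            let mut plan = Vec::new();
            let mut cur = s;
            while let Some(Some((p, a))) = parent.get(&cur) {
                plan.push(*a);
                cur = *p;
            }
            plan.reverse();
            return Some(plan);
        }
        for (a, n) in successors(&s) {
            if !parent.contains_key(&n) {
                parent.insert(n, Some((s, a)));
                open.push(Reverse((heuristic(&n), n)));
            }
        }
    }
    None
}

// Replays a plan from the initial state so the result can be checked.
fn execute(plan: &[&str]) -> State {
    plan.iter().fold(initial_state(), |s, a| match *a {
        "inc" => s + 1,
        "double" => s * 2,
        _ => s,
    })
}

fn main() {
    match gbfs() {
        Some(plan) => println!("plan ({} steps): {:?}", plan.len(), plan),
        None => println!("no plan found"),
    }
}
```

In this sketch the search never calls a model: the heuristic is an ordinary compiled function, and `execute` replays the returned plan against the same transition rules so that its validity can be checked independently of the heuristic.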

Paper Structure

This paper contains 15 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Procedure flowchart. A general problem description is manually written using Rust components (successor generator, goal test, and initial state), which are then fed into the system. For problems already described in PDDL2.1, the Rust translation is derived directly from the encoding.
  • Figure 2: Per-instance comparisons of the Total Time (left) and expanded states (right) between GPT-4.1 with SelfPortfolio-10 and $h^{\text{add}}_{\langle \text{B, QB}\rangle}$. Points below the diagonal favor our approach. On problems that both solve, the two appear to provide similar levels of heuristic guidance.
  • Figure 3: Per-instance comparison of the Total Time (generation + run) (top) and expanded states (bottom) between GPT-5.1 with and without reasoning_effort set to high. Points below the diagonal favor high reasoning effort. Increased reasoning effort moderately improves heuristic quality, but at a heavy cost in time.
  • Figure 4: As a variance analysis, we generated 40 heuristics with GPT-4.1 in each domain and ran a Monte Carlo simulation of our algorithm for 1000 iterations, each sampling a heuristic order and recording the per-domain and overall coverage. The boxes represent the first and third quartiles, the thick line the median, and the whiskers the 1st and 99th percentiles. Although limited, the analysis shows the method to be highly consistent in most domains.