
GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models

Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore

TL;DR

GenePlan (GENeralized Evolutionary Planner) is a framework that uses large language model (LLM) assisted evolutionary algorithms to generate domain-dependent generalized planners for classical planning tasks described in PDDL. It iteratively evolves interpretable Python planners that minimize plan length across diverse problem instances.

Abstract

We present GenePlan (GENeralized Evolutionary Planner), a novel framework that leverages large language model (LLM) assisted evolutionary algorithms to generate domain-dependent generalized planners for classical planning tasks described in PDDL. By casting generalized planning as an optimization problem, GenePlan iteratively evolves interpretable Python planners that minimize plan length across diverse problem instances. In empirical evaluation across six existing benchmark domains and two new domains, GenePlan achieved an average SAT score of 0.91, closely matching the performance of the state-of-the-art planners (SAT score 0.93), and significantly outperforming other LLM-based baselines such as chain-of-thought (CoT) prompting (average SAT score 0.64). The generated planners solve new instances rapidly (average 0.49 seconds per task) and at low cost (average $1.82 per domain using GPT-4o).

Paper Structure

This paper contains 31 sections, 3 equations, 4 figures, 2 tables, 18 algorithms.

Figures (4)

  • Figure 1: Architecture of GenePlan. Python planners are stored in the planner_db. The right loop (1-9) generates new candidate planners, while the left loop (10-11) prunes low-scoring candidates at the end of each generation. After a run, the best planner can be extracted by querying the planner_db (12).
  • Figure 2: Evolutionary prompt template.
  • Figure 3: Critical difference diagram showing average rank per method across all problem instances. Lower ranks (left) are better and the horizontal lines connecting approaches indicate statistical indistinguishability.
  • Figure 4: Normalized score $\hat{f}(\Phi^*_t) / \hat{f}(\Phi^*)$ versus generation $t$. $\hat{f}(\Phi^*)$ is the optimal average plan length found by Fast Downward on the training tasks (or best found within 1 hour for domains labelled +), and $\hat{f}(\Phi^*_t)$ is GenePlan's current incumbent solution.
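
The normalized score in the Figure 4 caption can be illustrated with a small sketch. It assumes that $\hat{f}$ is a plain average of plan lengths over the training tasks and that every task is solved; how unsolved tasks are scored is not stated here. The function names and the numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_plan_length(plan_lengths):
    """Stand-in for f-hat: the mean plan length over the training tasks."""
    return mean(plan_lengths)


def normalized_scores(incumbent_lengths_per_generation, reference_lengths):
    """Return f-hat(incumbent at generation t) / f-hat(reference) for each t."""
    reference = average_plan_length(reference_lengths)
    return [
        average_plan_length(lengths) / reference
        for lengths in incumbent_lengths_per_generation
    ]


# Hypothetical plan lengths on three training tasks.
reference_lengths = [10, 20, 30]       # reference planner (e.g. optimal plans)
incumbent_lengths = [
    [15, 30, 45],                      # generation 0 incumbent
    [12, 24, 36],                      # generation 1 incumbent
    [10, 20, 30],                      # generation 2 incumbent
]
scores = normalized_scores(incumbent_lengths, reference_lengths)
```

Under these assumptions the ratio falls toward 1 as the incumbent's plans approach the reference plan lengths, since shorter plans are better.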

Theorems & Definitions (10)

  • Definition 1: PDDL Planning Task
  • Definition 2: Generalized Planning Instance
  • Definition 3: Generalized Plan
  • Definition 4: Generalized Planning Optimization Problem
  • Definition 5: Population
  • Definition 6: Fitness Function
  • Definition 7: Selection
  • Definition 8: Crossover
  • Definition 9: Mutation
  • Definition 10: Replacement Strategy
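
To show how the components named in Definitions 5 to 10 might fit together, here is a minimal, self-contained sketch of an evolutionary loop over a `planner_db`. It is not GenePlan's implementation. In GenePlan the candidates are Python planners and new candidates are produced with LLM assistance, whereas here each candidate is a toy integer "genome" and crossover and mutation are simple arithmetic stand-ins. The toy task (reach an integer from 0 with as few steps as possible), the tournament selection, the elitist replacement, and all function names are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    step: int                       # toy genome: largest step the planner takes
    fitness: float = float("inf")   # average plan length (lower is better)


def make_planner(step):
    """Build a toy planner: reach `task` from 0 using steps of `step`, then 1."""
    def planner(task):
        return [step] * (task // step) + [1] * (task % step)
    return planner


def fitness(candidate, tasks):
    """Fitness function: average plan length across the training tasks."""
    planner = make_planner(candidate.step)
    return sum(len(planner(t)) for t in tasks) / len(tasks)


def select(population, rng, k=2):
    """Selection: tournament, keeping the fittest of k random candidates."""
    return min(rng.sample(population, k), key=lambda c: c.fitness)


def crossover(a, b):
    """Crossover stand-in: average the parents' genomes."""
    return Candidate(step=max(1, (a.step + b.step) // 2))


def mutate(candidate, rng):
    """Mutation stand-in: nudge the genome up or down by one."""
    return Candidate(step=max(1, candidate.step + rng.choice([-1, 1])))


def replace(population, size):
    """Replacement strategy: elitist pruning to the `size` best candidates."""
    return sorted(population, key=lambda c: c.fitness)[:size]


def evolve(tasks, generations=20, size=4, offspring=4, seed=0):
    rng = random.Random(seed)
    # Population: start from deliberately weak planners (step size 1).
    planner_db = [Candidate(step=1) for _ in range(size)]
    for c in planner_db:
        c.fitness = fitness(c, tasks)
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            parent_a = select(planner_db, rng)
            parent_b = select(planner_db, rng)
            child = mutate(crossover(parent_a, parent_b), rng)
            child.fitness = fitness(child, tasks)
            children.append(child)
        # Prune low-scoring candidates at the end of each generation.
        planner_db = replace(planner_db + children, size)
    return planner_db[0]            # best planner currently in the database


tasks = [12, 18, 24]
best = evolve(tasks)
```

Because replacement here is elitist, the best fitness in `planner_db` can never get worse from one generation to the next. Whether GenePlan's replacement strategy has the same property is not something this sketch establishes.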