Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning

Katharina Stein, Daniel Fišer, Jörg Hoffmann, Alexander Koller

TL;DR

The paper asks whether large language models can plan in PDDL domains, and it removes the bottleneck of manually crafting natural language prompts by converting PDDL to NL descriptions automatically. It introduces PDDL2NL, a translation pipeline that uses LLMs to generate NL descriptions of predicates and actions, enabling LLM-based action choice and planning across diverse domains. The work presents a broad experimental evaluation comparing the automatically generated NL prompts to manual prompts, PDDL prompts, and template-based prompts, and analyzes four LLM mechanisms (Basic, CoT, Act, ReAct) against symbolic baselines, showing that the automated NL prompts can match manual prompts and that CoT and ReAct provide notable gains. While LLM-based planning lags behind state-of-the-art symbolic planners overall, the approach demonstrates promising, search-free capabilities and lays groundwork for future hybrid neurosymbolic planning approaches.

Abstract

Large language models (LLMs) have revolutionized a large variety of NLP tasks. An active debate is to what extent they can do reasoning and planning. Prior work has assessed the latter in the specific context of PDDL planning, based on manually converting three PDDL domains into natural language (NL) prompts. Here we automate this conversion step, showing how to leverage an LLM to automatically generate NL prompts from PDDL input. Our automatically generated NL prompts result in LLM planning performance similar to that of the previous manually generated ones. Beyond this, the automation enables us to run much larger experiments, providing for the first time a broad evaluation of LLM planning performance in PDDL. Our NL prompts yield better performance than PDDL prompts and simple template-based NL prompts. Compared to symbolic planners, LLM planning lags far behind; but in some domains, our best LLM configuration scales up further than A$^\star$ using LM-cut.

Paper Structure (38 sections, 20 figures, 7 tables)


Figures (20)

  • Figure 1: Part of the Logistics PDDL domain and problem file and the NL descriptions generated by PDDL2NL.
  • Figure 2: Part of the prompt for converting PDDL predicates into NL consisting of the task description (top), few-shot examples (middle) and the target predicate (bottom).
  • Figure 3: Overview of the set-up for the LLM plan generation and LLM action policy usage.
  • Figure 4: Structure of the prompts for the P-LLM in the LLM planning (left) set-up and in the policy set-up at the second prediction step (right).
  • Figure 5: Structure of the few-shot examples for the four mechanisms.
  • ...and 15 more figures
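
The caption of Figure 2 describes the predicate-conversion prompt as a task description, followed by few-shot examples, followed by the target predicate. A minimal sketch of that three-part layout might look like the following. The function name, the `PDDL:`/`NL:` labels, the task wording, and the example pair are all hypothetical illustrations and are not taken from the paper.

```python
def build_predicate_prompt(task_description, few_shot_examples, target_predicate):
    """Assemble a prompt: task description (top), few-shot examples (middle),
    target predicate (bottom). Labels are hypothetical, not the paper's format."""
    parts = [task_description.strip(), ""]
    for pddl, nl in few_shot_examples:
        parts.append(f"PDDL: {pddl}")
        parts.append(f"NL: {nl}")
        parts.append("")
    # The target predicate goes last, with an open slot for the model to complete.
    parts.append(f"PDDL: {target_predicate}")
    parts.append("NL:")
    return "\n".join(parts)


prompt = build_predicate_prompt(
    "Describe each PDDL predicate in natural language.",  # hypothetical wording
    [("(at ?obj ?loc)", "{?obj} is at {?loc}")],          # hypothetical example
    "(in ?pkg ?veh)",
)
print(prompt)
```

The resulting string would then be sent to whatever LLM client is in use; that call is left out here because the source does not specify one.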