Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning
Katharina Stein, Daniel Fišer, Jörg Hoffmann, Alexander Koller
TL;DR
The paper asks whether large language models (LLMs) can plan in PDDL domains and removes the bottleneck of hand-crafting natural language (NL) prompts by converting PDDL into NL descriptions automatically. It introduces PDDL2NL, a translation pipeline that uses an LLM to generate NL descriptions of predicates and actions, so that LLMs can perform action choice and planning across diverse domains. A broad experimental evaluation compares the automatically generated NL prompts with manual, PDDL, and template-based prompts, and evaluates four LLM mechanisms (Basic, CoT, Act, ReAct) against symbolic baselines. The automatically generated NL prompts match the manual ones, and CoT and ReAct provide notable gains. Although LLM-based planning lags behind state-of-the-art symbolic planners overall, the approach demonstrates promising search-free capabilities and lays groundwork for hybrid neurosymbolic planning in the future.
Abstract
Large language models (LLMs) have revolutionized a large variety of NLP tasks. An active debate is to what extent they can do reasoning and planning. Prior work has assessed the latter in the specific context of PDDL planning, based on manually converting three PDDL domains into natural language (NL) prompts. Here we automate this conversion step, showing how to leverage an LLM to automatically generate NL prompts from PDDL input. Our automatically generated NL prompts yield LLM planning performance similar to that of the previous manually generated ones. Beyond this, the automation enables us to run much larger experiments, providing for the first time a broad evaluation of LLM planning performance in PDDL. Our NL prompts yield better performance than PDDL prompts and simple template-based NL prompts. Compared to symbolic planners, LLM planning lags far behind; but in some domains, our best LLM configuration scales up further than A$^\star$ using LM-cut.
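To make the contrast with "simple template-based NL prompts" concrete, the following is a minimal sketch of what such a template might look like for ground PDDL atoms. It is an illustration only: the function name `verbalize_atom` and the wording rule (split the predicate name on hyphens, join arguments with "and") are assumptions made here, not the templates or the LLM-based pipeline used in the paper.

```python
def verbalize_atom(atom: str) -> str:
    """Turn a ground PDDL atom such as '(on-table b1)' into a naive NL phrase.

    Hypothetical helper for illustration; the rule below is an assumed
    template, not the one evaluated in the paper.
    """
    # Drop the surrounding parentheses and split into predicate name + arguments.
    tokens = atom.strip().strip("()").split()
    if not tokens:
        raise ValueError("empty atom")
    name, args = tokens[0], tokens[1:]
    # Hyphens and underscores in predicate names become spaces.
    words = name.replace("-", " ").replace("_", " ")
    if not args:
        return words
    # Arguments are appended verbatim, joined with "and".
    return f"{words} {' and '.join(args)}"


if __name__ == "__main__":
    for atom in ["(on-table b1)", "(on a b)", "(handempty)"]:
        print(atom, "->", verbalize_atom(atom))
```

A rule like this produces phrases such as "on a and b", which hints at why fixed templates can read less naturally than descriptions written by a person or generated by an LLM.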
