Abstraction Generation for Generalized Planning with Pretrained Large Language Models
Zhenhe Cui, Huaxiang Xia, Hangjun Shen, Kailun Luo, Yong He, Wei Liang
TL;DR
The paper addresses the generation of robust QNP abstractions for generalized planning (GP) with pretrained LLMs. A structured prompt protocol first derives domain features and then outputs a QNP abstraction, and an automated debugging loop repairs abstraction errors. The approach rests on a refinement-mapping framework that relates high-level QNP abstractions to low-level GP problems; iterative checks (ASC, HLISC, HLPRC, LLGRC) and a QNP solver (DSET) guide the corrections. Experiments on seven GP domains show that some LLMs (notably GPT-5.2) can produce useful abstractions when guided by debugging, although performance varies across models and domains. Debugging generally improves coverage, but domain-specific limitations and action-refinement constraints remain open challenges. Overall, the work shows a practical way to use LLMs for GP abstraction generation and points to future work on more expressive features and stronger abstraction guarantees.
Abstract
Qualitative Numerical Planning (QNP) serves as an important abstraction model for generalized planning (GP), which aims to compute general plans that solve multiple instances at once. Recent work shows that large language models (LLMs) can function as generalized planners. This work investigates whether LLMs can serve as QNP abstraction generators for GP problems and how to fix abstractions via automated debugging. We propose a prompt protocol in which a GP domain and training tasks are given to an LLM, which is prompted to generate abstract features and then to abstract the initial state, action set, and goal into a QNP problem. An automated debugging method detects abstraction errors and guides the LLM to fix them. Experiments demonstrate that, when properly guided by automated debugging, some LLMs can generate useful QNP abstractions.
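The generate-then-debug workflow described in the abstract can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions: `generate_with_debugging`, `CheckResult`, the prompt wording, and the `max_rounds` limit are hypothetical names and choices introduced here, not the paper's implementation. The LLM call and the individual checks are passed in as plain callables, and their internal logic is not reproduced.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    """Outcome of one abstraction check (hypothetical structure)."""
    name: str
    passed: bool
    feedback: str = ""


# Hypothetical interfaces: an LLM call maps a prompt to a candidate
# abstraction (as text); a check maps a candidate to a CheckResult.
Generator = Callable[[str], str]
Check = Callable[[str], CheckResult]


def generate_with_debugging(
    generate: Generator,
    checks: List[Check],
    base_prompt: str,
    max_rounds: int = 3,  # assumed retry budget, not taken from the paper
) -> Optional[str]:
    """Ask for an abstraction, run all checks, and re-prompt with the
    error report until every check passes or the budget is exhausted."""
    prompt = base_prompt
    for _ in range(max_rounds):
        abstraction = generate(prompt)
        results = [check(abstraction) for check in checks]
        failures = [r for r in results if not r.passed]
        if not failures:
            return abstraction
        # Feed the failed checks back to the model as repair guidance.
        report = "\n".join(f"[{r.name}] {r.feedback}" for r in failures)
        prompt = (
            f"{base_prompt}\n\nPrevious abstraction:\n{abstraction}"
            f"\n\nErrors:\n{report}\nPlease fix the abstraction."
        )
    return None  # no candidate passed all checks within the budget
```

Keeping the generator and the checks as injected callables means the loop itself stays independent of any particular model, prompt format, or QNP solver.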
