PROC2PDDL: Open-Domain Planning Representations from Texts

Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon

TL;DR

This work presents Proc2PDDL, the first open-domain dataset that maps procedural natural language texts to PDDL representations, enabling evaluation of text-to-planning in diverse domains. It formulates action modeling as predicting the domain file $\mathbb{DF}$ from text $\mathbb{T}$ and header $H$, and evaluates intrinsic domain-definition accuracy and extrinsic plan solvability via a BFS-based PDDL planner. A Zone of Proximal Development (ZPD) prompting strategy, which breaks the task into Extraction, Inference, and Translation, improves LM performance, yet GPT-4-level models still struggle to accurately generate domain actions or reliably solve problems from problem files ($\mathbb{PF}$s). The dataset uses wikiHow procedures to test open-domain transfer and highlights both syntactic and semantic errors as key bottlenecks for current LMs in symbolic planning. Overall, Proc2PDDL and the ZPD methodology offer a path toward integrating language understanding with formal planning, motivating further research in LM-driven open-domain planning.

Abstract

Planning in a text-based environment continues to be a major challenge for AI systems. Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. Using this dataset, we evaluate state-of-the-art models on defining the preconditions and effects of actions. We show that Proc2PDDL is highly challenging, with GPT-3.5's success rate close to 0% and GPT-4's around 35%. Our analysis shows both syntactic and semantic errors, indicating LMs' deficiency in both generating domain-specific programs and reasoning about events. We hope this analysis and dataset help future progress towards integrating the best of LMs and formal planning.

Paper Structure (17 sections, 2 figures, 5 tables)

Figures (2)

  • Figure 1: A PDDL solver produces a plan based on a minimal domain file and problem file. Previous work assumes the domain file as given, while we predict the action definitions in the domain file.
  • Figure 2: Our formulation of the $\mathbb{DF}$ action prediction task is as follows: given a natural language procedure text and a domain file header, a language model (LM) follows Zone of Proximal Development (ZPD) instructions in three sequential skills to predict domain actions, including parameters, preconditions, and effects. During evaluation, the predicted $\mathbb{DF}$ is compared to a gold reference and used to solve corresponding $\mathbb{PF}$s.
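
To make the extrinsic evaluation in Figure 2 concrete, the following is a minimal Python sketch of how a breadth-first planner can check whether a set of action definitions solves a problem. It is not the planner used in the paper: the `Action` class, the `bfs_plan` function, and the toy "boil water" facts are all hypothetical, and actions are simplified to already-grounded sets of facts rather than parameterized PDDL.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical grounded action: facts required, added, and deleted."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def bfs_plan(initial, goal, actions):
    """Return the shortest list of action names reaching the goal, or None."""
    start, goal = frozenset(initial), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return plan
        for action in actions:
            if action.preconditions <= state:  # action is applicable
                nxt = (state - action.del_effects) | action.add_effects
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [action.name]))
    return None  # search space exhausted: the problem is unsolvable


# Toy domain (an assumption for illustration, not taken from the dataset).
ACTIONS = [
    Action("collect_water", frozenset({"has_container"}),
           frozenset({"has_water"}), frozenset()),
    Action("build_fire", frozenset({"has_wood"}),
           frozenset({"fire_lit"}), frozenset({"has_wood"})),
    Action("boil_water", frozenset({"has_water", "fire_lit"}),
           frozenset({"water_boiled"}), frozenset()),
]

plan = bfs_plan({"has_container", "has_wood"}, {"water_boiled"}, ACTIONS)
print(plan)
```

In this simplified setting, a predicted action with a wrong precondition or a missing effect can make the goal unreachable, in which case `bfs_plan` returns `None`. That is the kind of failure an extrinsic solvability check is meant to surface.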