PDDLEGO: Iterative Planning in Textual Environments

Li Zhang, Peter Jansen, Tianyi Zhang, Peter Clark, Chris Callison-Burch, Niket Tandon

TL;DR

The paper tackles planning under partial observability in textual environments, where end-to-end LLM planning struggles because the agent's knowledge of the environment is incomplete. It introduces PDDLEGO, a neurosymbolic framework that iteratively builds a PDDL representation during exploration using two LLM-based modes: PDDL-gen, which generates a full problem file (PF) at each step, and PDDL-edit, which applies constrained edits to the current PF. Both are guided by sub-goals when the end-goal is not yet attainable. Empirically, PDDLEGO improves planning efficiency and success rate on two text-game benchmarks (Coin Collector and Cooking World) compared with end-to-end action generation, achieving a 43% efficiency gain on Coin Collector and up to 98% success on the easy setting of Cooking World, and remaining robust on harder variants. The approach also improves interpretability and correctability by reducing the planning task to a PF that a symbolic planner solves deterministically, at the cost of slower PDDL generation and a requirement for domain-file annotations and sub-goal structures.
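To make the idea of constrained edits concrete, the following is a minimal sketch of how a problem file could be held as sets of objects and initial facts and updated by add/delete operations instead of being regenerated in full. The `ProblemFile` class, the `apply_edits` helper, the edit tuple format, and the predicate names are all assumptions made for illustration; they are not the paper's actual edit format or implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemFile:
    """Hypothetical in-memory view of a PDDL problem file."""
    objects: set = field(default_factory=set)
    init: set = field(default_factory=set)
    goal: str = ""

    def render(self) -> str:
        """Serialize to PDDL problem-file text (sorted for stable output)."""
        objs = " ".join(sorted(self.objects))
        facts = "\n    ".join(sorted(self.init))
        return (
            "(define (problem toy)\n"
            f"  (:objects {objs})\n"
            f"  (:init\n    {facts}\n  )\n"
            f"  (:goal {self.goal})\n)"
        )


def apply_edits(pf, edits):
    """Return a copy of `pf` with (op, section, item) edits applied.

    Only 'add' and 'delete' on the 'objects' and 'init' sections are
    allowed, so an edit cannot rewrite the rest of the file.
    """
    updated = ProblemFile(set(pf.objects), set(pf.init), pf.goal)
    for op, section, item in edits:
        if section not in ("objects", "init"):
            raise ValueError(f"unsupported section: {section}")
        target = getattr(updated, section)
        if op == "add":
            target.add(item)
        elif op == "delete":
            target.discard(item)
        else:
            raise ValueError(f"unsupported operation: {op}")
    return updated


# Illustrative predicates only; a real domain file defines its own.
pf = ProblemFile(objects={"kitchen"}, init={"(at kitchen)"}, goal="(at hall)")
edits = [
    ("add", "objects", "hall"),
    ("add", "init", "(connected kitchen hall west)"),
]
updated = apply_edits(pf, edits)
print(updated.render())
```

In this sketch the original `pf` is left untouched, so a rejected or faulty edit can be discarded without losing the previous problem file.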

Abstract

Planning in textual environments has been shown to be a long-standing challenge even for current models. A recent, promising line of work uses LLMs to generate a formal representation of the environment that can be solved by a symbolic planner. However, existing methods rely on a fully-observed environment where all entity states are initially known, so a one-off representation can be constructed, leading to a complete plan. In contrast, we tackle partially-observed environments where there is initially insufficient information to plan for the end-goal. We propose PDDLEGO, which iteratively constructs a planning representation that can lead to a partial plan for a given sub-goal. By accomplishing the sub-goal, more information is acquired to augment the representation, eventually achieving the end-goal. We show that plans produced by few-shot PDDLEGO are 43% more efficient than generating plans end-to-end on the Coin Collector simulation, with strong performance (98%) on the more complex Cooking World simulation where end-to-end LLMs fail to generate coherent plans (4%).
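The iterative loop described in the abstract can be sketched on a toy map. This is a simplified illustration under stated assumptions, not the authors' implementation: the map `WORLD`, the `observe`, `plan`, and `explore` functions are hypothetical, a breadth-first search stands in for the symbolic PDDL solver, and the LLM step that writes the problem file is replaced by directly recording observations in a dictionary. The sub-goal used here is "reach any unvisited room", pursued for as long as the coin has not been found.

```python
from collections import deque

# Hidden ground truth (hypothetical toy map): room -> neighbouring rooms.
WORLD = {
    "kitchen": ["hall"],
    "hall": ["kitchen", "pantry", "study"],
    "pantry": ["hall"],
    "study": ["hall"],
}
COIN_ROOM = "study"


def observe(room):
    """Return the exits of `room` and whether the coin is in it."""
    return WORLD[room], room == COIN_ROOM


def plan(known, start, is_goal):
    """BFS over the known map; a stand-in for a symbolic planner.

    Returns the list of rooms to move through, or None if no known
    room satisfies `is_goal`.
    """
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if is_goal(path[-1]):
            return path[1:]
        for nxt in known.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def explore(start):
    """Alternate between observing, updating the map, and planning."""
    known, visited = {}, set()
    room, steps = start, []
    while True:
        exits, has_coin = observe(room)
        known[room] = exits      # augment the representation
        visited.add(room)
        if has_coin:             # end-goal reached
            return steps
        # End-goal not plannable yet: plan for the sub-goal instead.
        path = plan(known, room, lambda r: r not in visited)
        if path is None:         # nothing left to explore
            return None
        for nxt in path:
            room = nxt
            steps.append(nxt)


print(explore("kitchen"))  # ['hall', 'pantry', 'hall', 'study']
```

Each pass through the loop plans only as far as the current knowledge allows, then uses the new observation to extend the map before planning again.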

Paper Structure

This paper contains 11 sections, 6 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: A fully-observed environment like BlocksWorld (upper, to rearrange objects from and to a given configuration) can be tackled by generating a PDDL problem file, while a partially observed one like Coin Collector (lower, to look for an object in an unknown location) cannot until sufficient exploration.
  • Figure 2: The pipeline of PDDLEGO. A PDDL problem file is iteratively built during exploration.
  • Figure 3: On Coin Collector, the mean and standard deviation of the number of steps to success (lower is better) for each development example, each over 5 trials with different random seeds of gpt-4-1106-preview, comparing Action-gen and PDDL-edit. The error bar represents the sample standard deviation. On examples 0 and 6, PDDL-edit fails and is thus not shown.
  • Figure 4: A PDDL solver produces a plan based on a minimal domain file and problem file. Previous work assumes the domain file as given, while we predict the action definitions in the domain file.
  • Figure 5: Annotated domain file for Coin Collector.
  • ...and 3 more figures