One Demo Is All It Takes: Planning Domain Derivation with LLMs from A Single Demonstration

Jinbang Huang, Yixin Xiao, Zhanguang Zhang, Mark Coates, Jianye Hao, Yingxue Zhang

TL;DR

PDDLLM addresses the challenge of long-horizon robotic planning by deriving symbolic planning domains from a single demonstration using LLM reasoning and physics-based simulation. It automatically generates predicates and actions without manual domain initialization, then interfaces with low-level motion planners via LoCA to execute plans. Evaluated on 1,200 tasks across nine environments and deployed on multiple real robots, PDDLLM outperforms six LLM-based baselines, reduces token costs, and approaches expert-designed domain quality. This work significantly reduces human effort in domain engineering and enables scalable, robust TAMP for real-world robotics.

Abstract

Pre-trained large language models (LLMs) show promise for robotic task planning but often struggle to guarantee correctness in long-horizon problems. Task and motion planning (TAMP) addresses this by grounding symbolic plans in low-level execution, yet it relies heavily on manually engineered planning domains. To improve long-horizon planning reliability and reduce human intervention, we present Planning Domain Derivation with LLMs (PDDLLM), a framework that automatically induces symbolic predicates and actions directly from demonstration trajectories by combining LLM reasoning with physical simulation roll-outs. Unlike prior domain-inference methods that rely on partially predefined or language descriptions of planning domains, PDDLLM constructs domains without manual domain initialization and automatically integrates them with motion planners to produce executable plans, enhancing long-horizon planning automation. Across 1,200 tasks in nine environments, PDDLLM outperforms six LLM-based planning baselines, achieving at least 20% higher success rates, reduced token costs, and successful deployment on multiple physical robot platforms.

Paper Structure

This paper contains 77 sections, 2 equations, 19 figures, 19 tables.

Figures (19)

  • Figure 1: Overview of the proposed framework. (1) Human demonstrations, in the form of manipulation trajectories, and the corresponding task descriptions serve as input. Implementation details are given in the Demonstration section. (2) PDDLLM launches thousands of parallel simulations, using the resulting roll-outs and rich physics-based feedback to guide the LLM in summarizing them into meaningful predicates, and returns a predicate library annotated with each predicate’s relevance to the current task. (3) Actions are invented by an LLM that summarizes logical state transition patterns from the demonstration, which is grounded into logical states using the imagined predicates. (4) The predicates and actions are compiled into a planning domain, which is automatically interfaced with a motion planning algorithm by the Logical Constraint Adapter (LoCA) to solve new tasks.
  • Figure 2: a. This example illustrates the imagination of predicates for relative object positions. Let $u$ be a configurable variable for each dimension. Object poses are sampled and simulated, with infeasible cases filtered out by the simulation feedback. Feasible subspaces are provided to the LLM to generate first-order predicates with their corresponding physical constraints. Higher-order predicates can be further derived from first-order predicates using logical operators (e.g., "not", "for all"). Diverse predicate examples are provided in the section "Example of PDDLLM imagined predicates". b. This example shows how the Stack action is invented. Continuous states are grounded into logical states using the imagined predicates, where the state transition represents the logical action. By prompting the LLM with the pair of the current state and the next state after the transition, we obtain the PDDL definition of the action Stack. c. The integration of actions with the motion planner is handled automatically by LoCA, which retrieves the physical constraints associated with each first-order predicate in the action effect set $\mathcal{P}_{eff}$ and applies these constraints for motion planning.
  • Figure 3: (left) Planning success rate trend across increasing object counts. (right) Overall planning success rate under varying time limits.
  • Figure 4: Real-robot experiments on three different platforms
  • Figure 5: Franka Panda Arm building a bridge
  • ...and 14 more figures
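
The grounding and action-invention steps described in the Figure 2 caption can be illustrated with a small Python sketch. It is not the authors' implementation: the `Predicate` class, the `on` constraint with its block size and tolerance, and the helper names `ground`, `transition_effects`, and `effect_constraints` are all hypothetical, and a hand-written geometric test stands in for the LLM-generated, simulation-filtered constraints. The sketch only shows how continuous poses might be mapped to logical atoms, how the difference between two grounded states yields candidate add and delete effects, and how the constraints behind the add effects could be looked up for a motion planner.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, Set, Tuple

Pose = Tuple[float, float, float]   # (x, y, z) of an object's centre
State = Dict[str, Pose]             # object name -> pose
Atom = Tuple[str, str, str]         # (predicate name, object a, object b)


@dataclass(frozen=True)
class Predicate:
    """A first-order predicate paired with its physical constraint (hypothetical)."""
    name: str
    holds: Callable[[Pose, Pose], bool]


def _on(a: Pose, b: Pose, size: float = 0.05, tol: float = 0.01) -> bool:
    # Assumed constraint: a is aligned with b in x/y and sits one block height above it.
    return (abs(a[0] - b[0]) < tol
            and abs(a[1] - b[1]) < tol
            and abs((a[2] - b[2]) - size) < tol)


ON = Predicate("on", _on)


def ground(state: State, predicates: Iterable[Predicate]) -> FrozenSet[Atom]:
    """Map a continuous state to the set of atoms whose constraints hold."""
    atoms: Set[Atom] = set()
    for p in predicates:
        for a, pose_a in state.items():
            for b, pose_b in state.items():
                if a != b and p.holds(pose_a, pose_b):
                    atoms.add((p.name, a, b))
    return frozenset(atoms)


def transition_effects(before: State, after: State, predicates: Iterable[Predicate]):
    """Return (add effects, delete effects) of one demonstrated transition."""
    preds = list(predicates)
    s0, s1 = ground(before, preds), ground(after, preds)
    return s1 - s0, s0 - s1


def effect_constraints(add_effects: Iterable[Atom], predicates: Iterable[Predicate]):
    """Look up the physical constraint behind each add effect (LoCA-style retrieval)."""
    by_name = {p.name: p.holds for p in predicates}
    return {atom: by_name[atom[0]] for atom in add_effects}


# One demonstrated step: block "a" is moved from the table onto block "b".
before = {"a": (0.30, 0.0, 0.025), "b": (0.0, 0.0, 0.025)}
after = {"a": (0.00, 0.0, 0.075), "b": (0.0, 0.0, 0.025)}

add, delete = transition_effects(before, after, [ON])
print("add effects:", sorted(add))
print("delete effects:", sorted(delete))
```

In the paper's pipeline, the pair of grounded states would be passed to an LLM to write the PDDL action definition; here the set difference is only a stand-in that makes the state-transition idea concrete.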

Theorems & Definitions (1)

  • Definition 1: Feature Space