Bootstrapping Object-level Planning with Large Language Models

David Paulius; Alejandro Agostini; Benedict Quartey; George Konidaris

TL;DR

This work tackles the gap between natural language understanding and robot task execution by introducing object-level planning (OLP), which uses functional object-oriented networks (FOON) to bootstrap task and motion planning. An LLM is prompted to produce object-level plan sketches, which are transformed into FOON graphs and grounded into PDDL subgoals, enabling sound and complete task planning while delegating low-level feasibility to motion planning. Empirical results in simulation show that the OLP approach outperforms direct LLM planning and PDDL-generation baselines in plan completion and robustness, while requiring fewer prompting resources than some baselines. The method aligns the knowledge expressed in language with object-level representations and hierarchical planning, and points to future work on broader task regimes and human-in-the-loop corrections.

Abstract

We introduce a new method that extracts knowledge from a large language model (LLM) to produce object-level plans, which describe high-level changes to object state, and uses them to bootstrap task and motion planning (TAMP). Existing work uses LLMs to directly output task plans or generate goals in representations like PDDL. However, these methods fall short because they rely on the LLM to do the actual planning or output a hard-to-satisfy goal. Our approach instead extracts knowledge from an LLM in the form of plan schemas as an object-level representation called functional object-oriented networks (FOON), from which we automatically generate PDDL subgoals. Our method markedly outperforms alternative planning strategies in completing several pick-and-place tasks in simulation.
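
To make the pipeline in the abstract more concrete, the following is a minimal sketch of how one object-level action could be represented and turned into a PDDL goal. It assumes the usual FOON notion of a functional unit (input object states, an action, output object states). The class names, the `subgoal_to_pddl` helper, the predicate `on`, and the object names are all hypothetical illustrations, not the authors' implementation or their grounding procedure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectNode:
    """An object in a particular state (hypothetical simplification of a FOON object node)."""
    name: str
    states: tuple = ()      # unary states, e.g. ("empty",)
    relations: tuple = ()   # pairs of (relation, other_object), e.g. (("on", "table"),)


@dataclass
class FunctionalUnit:
    """One object-level action: input object states -> action -> output object states."""
    action: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def subgoal_to_pddl(unit: FunctionalUnit) -> str:
    """Render the output nodes of a functional unit as a PDDL goal conjunction.

    Assumption: each state or relation maps one-to-one onto a PDDL predicate
    with the same name. A real system would ground these against the robot's
    environment instead.
    """
    literals = []
    for node in unit.outputs:
        literals += [f"({state} {node.name})" for state in node.states]
        literals += [f"({rel} {node.name} {other})" for rel, other in node.relations]
    return "(:goal (and " + " ".join(literals) + "))"


# Hypothetical example: stacking one block on another.
unit = FunctionalUnit(
    action="place",
    inputs=[ObjectNode("block_a", relations=(("on", "table"),))],
    outputs=[ObjectNode("block_a", relations=(("on", "block_b"),))],
)
print(subgoal_to_pddl(unit))
```

Each functional unit yields one subgoal, so a task planner only has to solve a short segment per object-level action rather than the whole task at once.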

Paper Structure (23 sections, 16 figures, 1 table)

Introduction
Background
Related Work
Object-level Planning with Language Models
Object-level Planning
LLM Prompting to Object-level Plan
Bridging to Task and Motion Planning
Object-Level to Task-Level Planning
Task-Level to Motion-Level Planning
Evaluation
Experimental Setup
Baseline Methods
LLM-Planner
LLM+P
DELTA
...and 8 more sections

Figures (16)

Figure 1: Our approach prompts an LLM for object-level information with which we construct an object-level plan (as a FOON). This plan schema bootstraps task- and motion-level planning (TAMP) via PDDL subgoals.
Figure 2: Our approach interfaces with a language model to generate object-level plans (as FOON graphs) for bootstrapping task and motion planning. We generate task-level subgoals as PDDL subgoals by grounding object-level subgoals to the robot's environment; with these task-level definitions, task planning obtains task plan segments per object-level action, which are executed using motion-level planning, improving on prior work paulius2023longhorizon.
Figure 3: Illustration of how a user task specified in natural language is transformed into an object-level plan (OLP) as a FOON via LLM prompting.
Figure 4: Example of task-level grounding for an object-level plan (Figure 3), which is compatible by design with the planning operators in Figure 5.
Figure 5: Planning operators for pick and place actions using object-centered predicates agostini2020manipulation and executable via motion-level planning (Section \ref{sec:foon-to-pddl-sense}).
...and 11 more figures
