Bootstrapping Object-level Planning with Large Language Models
David Paulius, Alejandro Agostini, Benedict Quartey, George Konidaris
TL;DR
This work addresses the gap between natural language understanding and robot task execution by introducing object-level planning (OLP), which uses functional object-oriented networks (FOON) to bootstrap task and motion planning (TAMP). An LLM is prompted to produce object-level plan sketches, which are converted into FOON graphs and grounded into PDDL subgoals. This enables sound and complete task planning while delegating low-level feasibility checks to motion planning. Empirical results in simulation show that OLP outperforms direct LLM planning and PDDL-generation baselines in plan completion and robustness, while requiring fewer prompting resources than some of those baselines. The method advances practical robot planning by aligning the knowledge expressed in language with object-level representations and hierarchical planning, and it points to future work on broader task regimes and human-in-the-loop corrections.
Abstract
We introduce a new method that extracts knowledge from a large language model (LLM) to produce object-level plans, which describe high-level changes to object state, and uses them to bootstrap task and motion planning (TAMP). Existing work uses LLMs to directly output task plans or generate goals in representations like PDDL. However, these methods fall short because they rely on the LLM to do the actual planning or output a hard-to-satisfy goal. Our approach instead extracts knowledge from an LLM in the form of plan schemas as an object-level representation called functional object-oriented networks (FOON), from which we automatically generate PDDL subgoals. Our method markedly outperforms alternative planning strategies in completing several pick-and-place tasks in simulation.
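To make the FOON-to-PDDL step more concrete, the following is a minimal sketch of how one object-level plan step might be represented and turned into a PDDL subgoal. It is an illustration under stated assumptions, not the authors' implementation: the class names (`ObjectNode`, `FunctionalUnit`), the helper functions, the state-label convention ("relation target"), and the block-stacking example are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ObjectNode:
    """An object together with its state labels (hypothetical encoding)."""
    name: str
    states: Tuple[str, ...]  # e.g. ("on table",) or ("clear",)


@dataclass
class FunctionalUnit:
    """One object-level step: input objects -> action -> output objects."""
    action: str
    inputs: List[ObjectNode]
    outputs: List[ObjectNode]


def state_to_predicate(obj: str, state: str) -> str:
    # Assumed convention: a state is either "relation target" or a single word.
    parts = state.split()
    if len(parts) == 2:
        relation, target = parts
        return f"({relation} {obj} {target})"
    return f"({parts[0]} {obj})"


def unit_to_subgoal(unit: FunctionalUnit) -> str:
    # The subgoal is the conjunction of all output object states.
    preds = [
        state_to_predicate(node.name, state)
        for node in unit.outputs
        for state in node.states
    ]
    return "(:goal (and " + " ".join(preds) + "))"


# Hypothetical step: move block_a from the table onto block_b.
stack = FunctionalUnit(
    action="place",
    inputs=[ObjectNode("block_a", ("on table",)),
            ObjectNode("block_b", ("clear",))],
    outputs=[ObjectNode("block_a", ("on block_b",))],
)
subgoal = unit_to_subgoal(stack)
print(subgoal)
```

In this sketch, each functional unit yields one small subgoal that a task planner could solve in sequence, rather than asking the LLM to emit a single, hard-to-satisfy goal for the whole task.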
