Language models are robotic planners: reframing plans as goal refinement graphs
Ateeq Sharfuddin, Travis Breaux
TL;DR
This work addresses the challenge of translating high-level robotic goals into executable plans by applying requirements-engineering techniques to form goal refinement graphs. The authors encode these graphs as code fragments and prompt LLMs to complete them, then evaluate executability and correctness with a longest common subsequence metric against human-written programs. On a curated set of 20 VirtualHome-based tasks, GPT-4 variants achieve up to 94% alignment with human plans, outperform GPT-3.5, and generate actionable robot instructions more robustly. The approach offers a scalable way to leverage the world knowledge in LLMs for embodied planning, with implications for broader robotic task automation and for future extensions to richer goal types and datasets.
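To make the longest common subsequence (LCS) metric concrete, the following is a minimal sketch of how a generated plan might be scored against a human-written one. It rests on assumptions not stated in this summary: each plan is treated as a list of step strings, the step strings are simplified placeholders rather than actual VirtualHome syntax, and the LCS length is normalized by the length of the longer plan. The function names `lcs_length` and `lcs_score` are illustrative only.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Return the length of the longest common subsequence of two step lists."""
    # prev[j] holds the LCS length of the already-processed prefix of `a` and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_score(generated: Sequence[str], reference: Sequence[str]) -> float:
    """Normalize the LCS length by the longer plan (an assumed normalization)."""
    longest = max(len(generated), len(reference))
    if longest == 0:
        return 1.0  # two empty plans are treated as identical
    return lcs_length(generated, reference) / longest


# Hypothetical, simplified step strings (not real VirtualHome syntax).
generated = ["walk kitchen", "open fridge", "grab milk", "close fridge"]
reference = ["walk kitchen", "open fridge", "grab milk", "pour milk", "close fridge"]

print(lcs_score(generated, reference))  # 0.8
```

Because a subsequence preserves order but allows gaps, a plan that omits or adds a step loses only part of its score, whereas an exact-match comparison would give it none.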
Abstract
Successful application of large language models (LLMs) to robotic planning and execution may pave the way to automating numerous real-world tasks. Recent research shows that the knowledge contained in LLMs can be used to make goal-driven decisions that are enactable in interactive, embodied environments. Nonetheless, the programs that LLMs generate show a considerable drop in correctness. We apply goal modeling techniques from software engineering to LLMs that generate robotic plans. Specifically, the LLM is prompted to generate a step refinement graph for a task. The executability and correctness of the program converted from this refinement graph are then evaluated. The approach yields programs that human judges rate as more correct than those of previous work.
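As an illustration of converting a refinement graph into a program, the sketch below models a task as a tree of goals whose leaves are executable steps, then flattens it depth-first into an ordered step list. This is an assumed representation for exposition only: the abstract does not specify the encoding the authors use, and the `Goal` class, the `to_program` function, and the example task are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Goal:
    """A node in a (hypothetical) refinement graph, restricted here to a tree."""
    description: str
    refinements: List["Goal"] = field(default_factory=list)


def to_program(goal: Goal) -> List[str]:
    """Flatten a goal into an ordered list of leaf steps, depth-first."""
    if not goal.refinements:
        # A goal with no further refinement is treated as an executable step.
        return [goal.description]
    steps: List[str] = []
    for sub in goal.refinements:
        steps.extend(to_program(sub))
    return steps


# Hypothetical task: the high-level goal is refined into subgoals, then steps.
task = Goal("get milk", [
    Goal("reach fridge", [Goal("walk kitchen"), Goal("open fridge")]),
    Goal("take milk", [Goal("grab milk"), Goal("close fridge")]),
])

print(to_program(task))
# ['walk kitchen', 'open fridge', 'grab milk', 'close fridge']
```

Keeping the intermediate goals explicit separates what the plan is meant to achieve from the low-level steps that achieve it; only the leaves need to be executable.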
