LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Shu Wang, Muzhi Han, Ziyuan Jiao, Zeyu Zhang, Ying Nian Wu, Song-Chun Zhu, Hangxin Liu

TL;DR

The paper targets the coupling challenge in task and motion planning by introducing LLM3, a framework that uses a pre-trained large language model as a domain-independent planner, action-parameter sampler, and motion-failure reasoner. LLM3 iteratively weighs motion planning feedback, represented as categorized failures (collisions and unreachability), to refine symbolic action sequences and continuous parameters. Through simulations in a box-packing domain and real-robot experiments, the authors show that motion-feedback-guided planning reduces planning iterations and motion-planning calls, with ablations highlighting the importance of failure reasoning. The work suggests that LLM-based interfaces can generalize TAMP across domains, offering a promising path toward flexible, real-world robotic manipulation without manually engineered interfaces.

Abstract

Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning. Crucially, LLM^3 incorporates motion planning feedback through prompting, allowing the LLM to iteratively refine its proposals by reasoning about motion failure. Consequently, LLM^3 interfaces between task planning and motion planning, alleviating the intricate design process of handling domain-specific messages between them. Through a series of simulations in a box-packing domain, we quantitatively demonstrate the effectiveness of LLM^3 in solving TAMP problems and the efficiency in selecting action parameters. Ablation studies underscore the significant contribution of motion failure reasoning to the success of LLM^3. Furthermore, we conduct qualitative experiments on a physical manipulator, demonstrating the practical applicability of our approach in real-world settings.
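
The propose, verify, and refine cycle described in the abstract can be sketched as a short loop. This is only an illustrative sketch, not the authors' implementation: `plan_with_feedback`, `Feedback`, `toy_propose`, and `toy_check_motion` are hypothetical names, the LLM is replaced by a hand-written rule, and the motion planner is replaced by a one-dimensional placement check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical encoding: (symbolic action name, one continuous parameter).
Action = Tuple[str, float]


@dataclass
class Feedback:
    """One categorized motion failure recorded for the next proposal."""
    step: int
    action: Action
    failure: str  # "collision" or "unreachable"


def plan_with_feedback(
    propose: Callable[[List[Feedback]], List[Action]],
    check_motion: Callable[[Action], Optional[str]],
    max_iters: int = 10,
) -> Tuple[Optional[List[Action]], int]:
    """Ask for a plan, verify each action, and feed failures back."""
    trace: List[Feedback] = []
    for iteration in range(1, max_iters + 1):
        plan = propose(trace)  # stand-in for prompting the LLM with the trace
        failure = None
        for step, action in enumerate(plan):
            result = check_motion(action)  # stand-in for a motion planner call
            if result is not None:
                failure = Feedback(step, action, result)
                break
        if failure is None:
            return plan, iteration  # every action was feasible
        trace.append(failure)
    return None, max_iters


def toy_check_motion(action: Action) -> Optional[str]:
    """Toy 1-D world: an obstacle sits near x = 0.5 and reach ends at x = 1.0."""
    _, x = action
    if x > 1.0:
        return "unreachable"
    if abs(x - 0.5) < 0.2:
        return "collision"
    return None


def toy_propose(trace: List[Feedback]) -> List[Action]:
    """Rule-based stand-in for the LLM: react to the most recent failure."""
    if not trace:
        return [("place", 0.5)]
    last = trace[-1]
    shift = 0.8 if last.failure == "collision" else -0.4
    return [("place", last.action[1] + shift)]


plan, iterations = plan_with_feedback(toy_propose, toy_check_motion)
```

In this toy run the first proposal collides, the second overshoots the reach limit, and the third is accepted, which mirrors how categorized feedback steers successive proposals in the framework.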

Paper Structure

This paper contains 15 sections, 1 equation, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: The proposed LLM3 framework. (a) Traditional TAMP frameworks rely on manually designed, domain-specific modules for interfacing between task and motion planners. (b) In contrast, we leverage a pre-trained LLM to iteratively propose refined plans and action parameters by reasoning about motion planning failures.
  • Figure 2: System diagram of the proposed LLM3 framework. (a) We show an example of utilizing a pre-trained LLM for reasoning and generating action sequences. (b) The feasibility of the proposed action sequence is verified by rollout with a motion planner and transition function $\mathcal{T}$. The motion planning feedback is saved into a trace that is provided to the LLM in the next iteration.
  • Figure 3: Prompt templates used by LLM3. We show alternative contents specific to the backtrack variant in orange and the from-scratch variant in blue.
  • Figure 4: Three types of motion possibilities. A: the object placement is in collision with an existing object. B: the object placement is beyond the robot's reach. C: the object placement is feasible.
  • Figure 5: The box-packing task setup in a simulated environment. (a) The task requires the robot to place one of (b) three sets of objects fully into the basket. (c) In setting 1, the total object size increases but the basket sizes remain the same. All baskets are fully reachable by the robot. (d) In setting 2, the basket size increases, but some portions of baskets are no longer within the robot's reach.
  • ...and 1 more figures
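
The three outcomes in Figure 4 (collision, out of reach, feasible) can be illustrated with a simple geometric check. The sketch below is an assumption-laden stand-in for a real motion planner: `Box`, `overlaps`, and `classify_placement` are hypothetical names, objects are axis-aligned rectangles in a plane, reach is approximated by a circle around the robot base, and testing collision before reach is an arbitrary choice.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle given by its center and side lengths."""
    cx: float
    cy: float
    w: float
    h: float


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles intersect with positive area."""
    return (abs(a.cx - b.cx) < (a.w + b.w) / 2
            and abs(a.cy - b.cy) < (a.h + b.h) / 2)


def classify_placement(
    candidate: Box,
    placed: Sequence[Box],
    max_reach: float,
    base: Tuple[float, float] = (0.0, 0.0),
) -> str:
    """Label a candidate placement as in Figure 4: A, B, or C."""
    # Case A: the placement overlaps an object that is already placed.
    if any(overlaps(candidate, other) for other in placed):
        return "collision"
    # Case B: the placement center lies outside the assumed circular reach.
    if math.hypot(candidate.cx - base[0], candidate.cy - base[1]) > max_reach:
        return "unreachable"
    # Case C: neither failure applies.
    return "feasible"


placed = [Box(0.5, 0.0, 0.2, 0.2)]
case_a = classify_placement(Box(0.55, 0.0, 0.2, 0.2), placed, max_reach=0.8)
case_b = classify_placement(Box(1.0, 0.0, 0.1, 0.1), placed, max_reach=0.8)
case_c = classify_placement(Box(0.2, 0.3, 0.1, 0.1), placed, max_reach=0.8)
```

A label of this kind is the sort of categorized feedback that could be written into the trace and shown to the LLM in the next iteration.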