DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models

Yuchen Liu, Luigi Palmieri, Sebastian Koch, Ilche Georgievski, Marco Aiello

TL;DR

DELTA addresses the challenge of long-horizon robot task planning with large language models by grounding LLM outputs in 3D scene graphs and decomposing long-term goals into sub-goals. The method generates formal domain and problem descriptions in PDDL, prunes scene graphs to essential items, and solves sub-problems autoregressively with an automated planner, concatenating executable sub-plans. Across five domains and multiple scenes, DELTA achieves higher success rates and orders-of-magnitude shorter planning times than four strong baselines, demonstrating the value of environment-grounded LLM planning and goal decomposition. The approach offers a scalable, automatic planning pipeline with strong generalization, and points to future work on handling dynamic uncertainties and real-world validation.
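
The autoregressive step described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy pick-and-place domain, the fact and action names, the hand-written sub-goals (in DELTA they come from an LLM), and the breadth-first search that stands in for an automated PDDL planner. The point is only the control flow: each sub-problem starts from the state the previous sub-plan ends in, and the sub-plans are concatenated.

```python
from collections import deque

# Hypothetical toy domain: action name -> (preconditions, add effects, delete effects).
ACTIONS = {
    "go_kitchen_to_living": ({"robot_at_kitchen"}, {"robot_at_living"}, {"robot_at_kitchen"}),
    "go_living_to_kitchen": ({"robot_at_living"}, {"robot_at_kitchen"}, {"robot_at_living"}),
    "pick_cup_living": ({"robot_at_living", "cup_at_living"}, {"holding_cup"}, {"cup_at_living"}),
    "place_cup_kitchen": ({"robot_at_kitchen", "holding_cup"}, {"cup_at_kitchen"}, {"holding_cup"}),
}


def plan_bfs(init, goal, actions):
    """Breadth-first search used here as a stand-in for an automated planner.

    Returns (plan, final_state), or (None, None) if the goal is unreachable.
    """
    start = frozenset(init)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return plan, state
        for name, (pre, add, delete) in actions.items():
            if pre <= state:  # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [name]))
    return None, None


def solve_autoregressively(init, sub_goals, actions):
    """Solve the sub-goals in order, feeding each final state into the next sub-problem."""
    state = frozenset(init)
    full_plan = []
    for goal in sub_goals:
        sub_plan, state_after = plan_bfs(state, goal, actions)
        if sub_plan is None:
            return None  # one unsolvable sub-problem fails the whole task
        full_plan.extend(sub_plan)  # concatenate the sub-plans
        state = state_after
    return full_plan


INIT = {"robot_at_kitchen", "cup_at_living"}
SUB_GOALS = [{"holding_cup"}, {"cup_at_kitchen"}]  # hand-written for this sketch
plan = solve_autoregressively(INIT, SUB_GOALS, ACTIONS)
print(plan)
```

Each call to `plan_bfs` only has to reach a nearby sub-goal, which is the intuition behind decomposition: the planner searches several small problems instead of one large one.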

Abstract

Recent advancements in Large Language Models (LLMs) have sparked a revolution across many research fields. In robotics, the integration of common-sense knowledge from LLMs into task and motion planning has drastically advanced the field by unlocking unprecedented levels of context awareness. Despite their vast collection of knowledge, large language models may generate infeasible plans due to hallucinations or missing domain information. To address these challenges and improve plan feasibility and computational efficiency, we introduce DELTA, a novel LLM-informed task planning approach. By using scene graphs as environment representations within LLMs, DELTA achieves rapid generation of precise planning problem descriptions. To enhance planning performance, DELTA decomposes long-term task goals with LLMs into an autoregressive sequence of sub-goals, enabling automated task planners to efficiently solve complex problems. In our extensive evaluation, we show that DELTA enables an efficient and fully automatic task planning pipeline, achieving higher planning success rates and significantly shorter planning times compared to the state of the art. Project webpage: https://delta-llm.github.io/

Paper Structure

This paper contains 20 sections, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: An example of the long-term task decomposition. A Scene Graph (SG) is pre-built from the environment (Armeni et al., 2019). Using the SG as the environment representation, a human user queries an LLM with goal descriptions to extract the relevant items and decompose the goal into multiple sub-goals. An automated task planner generates a task plan with respect to the sub-goals for the robot to execute.
  • Figure 2: The Shelbiana scene (Armeni et al., 2019) and the corresponding SG with floor, room, and item node layers. The edges represent semantic relationships. Not all item nodes are visualized.
  • Figure 3: The system architecture of DELTA with five steps: Domain Generation, Scene Graph Pruning, Problem Generation, Goal Decomposition, and Autoregressive Sub-Task Planning.
  • Figure 4: Failure analysis of DELTA with GPT-4o. Each step, success, and failure type is annotated with the number of trials. Problem Generation and Goal Decomposition are decoupled because planning for the original and the decomposed problems is independent and runs in parallel, with the same number of trials leaving the Scene Graph Pruning step.
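
The layered scene graph of Figure 2 and the Scene Graph Pruning step of Figure 3 can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the paper's implementation: the class names, the example rooms and items, and the choice to keep every floor and room while dropping only items are all hypothetical, and the set of relevant items is passed in by hand where DELTA would obtain it from an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str


@dataclass
class Room:
    name: str
    items: list = field(default_factory=list)


@dataclass
class Floor:
    name: str
    rooms: list = field(default_factory=list)


def prune(floors, relevant):
    """Return a new graph that keeps only items whose name is in `relevant`.

    Floors and rooms are always kept (an assumption of this sketch).
    """
    return [
        Floor(
            f.name,
            [Room(r.name, [i for i in r.items if i.name in relevant]) for r in f.rooms],
        )
        for f in floors
    ]


def count_items(floors):
    """Count the item nodes across all floors and rooms."""
    return sum(len(r.items) for f in floors for r in f.rooms)


# Hypothetical scene: one floor, two rooms, six items.
graph = [
    Floor("floor_A", [
        Room("kitchen", [Item("mug"), Item("fridge"), Item("oven")]),
        Room("living_room", [Item("sofa"), Item("tv"), Item("book")]),
    ])
]

pruned = prune(graph, relevant={"mug", "book"})
print(count_items(graph), count_items(pruned))
```

Pruning shrinks the object set that ends up in the generated problem description, which in turn shrinks the search space the planner has to explore.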