MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model

Yike Wu; Jiatao Zhang; Nan Hu; LanLing Tang; Guilin Qi; Jun Shao; Jie Ren; Wei Song

TL;DR

Open-source LLMs struggle with complex long-horizon robotic task planning because of their limited context handling and reasoning ability. MLDT addresses this with a three-level decomposition ($G$-, $t$-, and $a$-level, i.e. goal-, task-, and action-level) that simplifies planning, coupled with goal-sensitive corpus generation and instruction tuning, plus a new LongTasks benchmark for stress-testing. Empirical results in VirtualHome show MLDT achieving higher success rates across multiple open-source LLMs and substantially outperforming baselines on LongTasks, supporting the approach for long-horizon planning. The work suggests that structured decomposition and targeted data adaptation can unlock practical planning capabilities in resource-constrained LLMs and move them closer to real-world robotic applications.

Abstract

In the realm of data-driven AI technology, the application of open-source large language models (LLMs) in robotic task planning represents a significant milestone. Recent robotic task planning methods based on open-source LLMs typically leverage vast task planning datasets to enhance models' planning abilities. While these methods show promise, they struggle with complex long-horizon tasks, which require comprehending more context and generating longer action sequences. This paper addresses this limitation by proposing MLDT, the Multi-Level Decomposition Task planning method. This method decomposes tasks at the goal level, task level, and action level to mitigate the challenge of complex long-horizon tasks. To enhance open-source LLMs' planning abilities, we introduce a goal-sensitive corpus generation method to create high-quality training data and conduct instruction tuning on the generated corpus. Since the complexity of existing datasets is not high enough, we construct a more challenging dataset, LongTasks, specifically to evaluate planning ability on complex long-horizon tasks. We evaluate our method using various LLMs on four datasets in VirtualHome. Our results demonstrate a significant performance enhancement in robotic task planning, showcasing MLDT's effectiveness in overcoming the limitations of existing methods based on open-source LLMs as well as its practicality in complex, real-world scenarios.
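As a rough illustration of the goal-level, task-level, and action-level decomposition described in the abstract, the sketch below chains three planning stages so that each call only has to handle a small piece of the overall problem. It is a minimal sketch under stated assumptions, not the paper's implementation: the nested goal → task → action ordering, the prompt strings, the `plan` function, and the rule-based `toy_llm` stand-in are all hypothetical and exist only to make the example runnable without a real model.

```python
from typing import Callable, List

# Hypothetical stand-in for an LLM call: prompt string in, list of output lines out.
LLM = Callable[[str], List[str]]


def plan(goal: str, observation: str, llm: LLM) -> List[str]:
    """Assumed nesting: goal -> sub-goals -> tasks -> primitive actions."""
    actions: List[str] = []
    # Goal level: split the overall goal into independent sub-goals.
    for sub_goal in llm(f"GOAL-LEVEL\nobservation: {observation}\ngoal: {goal}"):
        # Task level: break each sub-goal into a short list of tasks.
        for task in llm(f"TASK-LEVEL\nobservation: {observation}\nsub-goal: {sub_goal}"):
            # Action level: expand each task into executable actions.
            actions.extend(llm(f"ACTION-LEVEL\nobservation: {observation}\ntask: {task}"))
    return actions


def toy_llm(prompt: str) -> List[str]:
    """Rule-based placeholder used only for this example (not a real model)."""
    level, *rest = prompt.split("\n")
    payload = rest[-1].split(": ", 1)[1]  # text after the last "key: " label
    if level == "GOAL-LEVEL":
        return [part.strip() for part in payload.split(" and ")]
    if level == "TASK-LEVEL":
        return [f"fetch for <{payload}>", f"deliver for <{payload}>"]
    return [f"[act] {payload}"]


if __name__ == "__main__":
    for step in plan("put apple in fridge and put cup on table", "kitchen", toy_llm):
        print(step)
```

The point of the structure is that no single prompt has to carry the full goal and the full action sequence at once, which is the difficulty the abstract attributes to long-horizon tasks. Swapping `toy_llm` for a real model call would keep the control flow unchanged.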

Paper Structure (20 sections, 3 equations, 10 figures, 3 tables)

Introduction
Related Works
Large Language Models for Robotic Task Planning
Complex Long-Horizon Robotic Task Planning
Preliminaries
Methods
Multi-Level Decomposition Task Planning Method
Goal-Level Decomposition
Task-Level Decomposition
Action-Level Decomposition
Instruction Tuning for Robotic Task Planning
Goal-sensitive Corpus Generation
Instruction Tuning for LLMs
LongTasks Dataset Construction
Experiments
...and 5 more sections

Figures (10)

Figure 1: The comparison between regular task planning and complex long-horizon task planning.
Figure 2: The overview of our proposed method, MLDT.
Figure 3: We design a programmatic prompt and generate reasoning traces and actions in an interleaved manner.
Figure 4: We devise a goal-sensitive corpus generation method to construct a training corpus used for instruction tuning.
Figure 5: The construction of LongTasks consists of three steps: object extraction, goal construction, and complexity verification.
...and 5 more figures
