DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

Zeyu Gao, Yao Mu, Jinye Qu, Mengkang Hu, Shijia Peng, Chengkai Hou, Lingyue Guo, Ping Luo, Shanghang Zhang, Yanfeng Lu

TL;DR

DAG-Plan tackles long-horizon dual-arm task planning by replacing linear task sequences with a Directed Acyclic Graph (DAG) representation generated by LLMs. It dynamically assigns sub-tasks to the appropriate arm using real-time environment observations and a cost-based selection mechanism, enabling parallel execution and adaptive behavior. The framework introduces occupy-release DAGs, two candidate sets (common and priority), and a cost function to guide feasible pairings, with execution carried out through four foundation-model-based skills and perception backends for pose estimation. Evaluations on the Dual-Arm Kitchen Benchmark show that DAG-Plan achieves higher efficiency and robustness than single-arm planning and other dual-arm baselines, including substantial gains in plan success rate and reduced token and query costs, while remaining viable in the real world. Overall, DAG-Plan demonstrates that DAG-based, environment-aware, cooperative planning can significantly advance the practicality of autonomous dual-arm manipulation in complex tasks.

Abstract

Dual-arm robots offer enhanced versatility and efficiency over single-arm counterparts by enabling concurrent manipulation of multiple objects or cooperative execution of tasks using both arms. However, coordinating dual-arm systems on long-horizon tasks remains challenging: the intricate temporal and spatial dependencies among sub-tasks require intelligent decisions about how to allocate actions between the arms and in what order to execute them. Existing task planning methods predominantly focus on single-arm robots, or rely on predefined bimanual operations and use large language models (LLMs) to generate task sequences with linear temporal dependencies, failing to fully leverage the capabilities of dual-arm systems. To address this limitation, we introduce DAG-Plan, a structured task planning framework tailored for dual-arm robots. DAG-Plan harnesses LLMs to decompose intricate tasks into actionable sub-tasks represented as nodes within a directed acyclic graph (DAG). Critically, DAG-Plan dynamically assigns these sub-tasks to the appropriate arm based on real-time environmental observations, enabling parallel and adaptive execution. We evaluate DAG-Plan on the Dual-Arm Kitchen Benchmark, comprising 5 sequential tasks with 44 sub-tasks. Extensive experiments demonstrate the superiority of DAG-Plan over directly using an LLM to generate a linear task sequence, achieving 52.8% higher efficiency than single-arm task planning and a 48% higher success rate in dual-arm task planning. Compared to iterative methods, DAG-Plan improves execution efficiency by 84.1% owing to its reduced query time. More demos and information are available at https://sites.google.com/view/dag-plan.
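
To make the contrast with a linear task sequence concrete, the following is a minimal sketch of sub-tasks stored as a dependency graph, using Python's standard `graphlib` module. The sub-task names and the helper functions `is_valid_dag` and `execution_stages` are hypothetical illustrations, not the representation used by DAG-Plan. The sketch shows only that sub-tasks with no mutual dependency become ready at the same time, which is what allows two arms to work in parallel.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical sub-tasks: each node maps to the set of nodes it depends on.
deps = {
    "grasp_apple": set(),
    "open_drawer": set(),
    "put_apple_in_drawer": {"grasp_apple", "open_drawer"},
    "close_drawer": {"put_apple_in_drawer"},
}

def is_valid_dag(graph):
    """Return True if the dependency graph contains no cycle."""
    try:
        TopologicalSorter(graph).prepare()
        return True
    except CycleError:
        return False

def execution_stages(graph):
    """Group nodes into stages; nodes within one stage do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all nodes whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages

stages = execution_stages(deps)
```

In this toy graph the first stage contains both `grasp_apple` and `open_drawer`, so one arm could take each, whereas a linear sequence would force them to run one after the other.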

Paper Structure

This paper contains 22 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: An overview of DAG-Plan. DAG-Plan generates a DAG from the human instruction and the environment description. It checks the graph's completeness and, if the graph is incomplete, prompts the LLM to reflect and regenerate it. Once a valid DAG is obtained, DAG-Plan performs task inference to identify executable candidate nodes. An occupied arm is assigned priority candidate nodes and a free arm is assigned common candidate nodes. The framework then evaluates all candidate combinations for feasibility and cost. DAG-Plan selects the combination with the lowest cost and executes it with skills from the skill library. DAG-Plan then updates the graph and repeats inference until the DAG is fully executed.
  • Figure 2: The process of task planning inference. In task 2, "clean the table (Hard)", DAG-Plan initializes the common candidate nodes from the DAG. It evaluates node combinations, checks their feasibility, and calculates their costs. The right arm is selected to grasp the apple and the left arm to open the drawer. After execution, the task graph and nodes are updated, and the subsequent release nodes are added to the right arm's priority candidate nodes. In stage 2, the right arm is assigned its priority candidate nodes, which are checked for feasibility, while the left arm, whose priority candidate set is empty, again selects from the common candidate nodes. The task graph and nodes are updated again. In stage 3, the left arm puts the mug into the drawer and the right arm grasps the lemon.
  • Figure 3: Snapshots of the 5 tasks in the Dual-Arm Kitchen Benchmark.
  • Figure 4: Simulation snapshots of the execution process of long-horizon task 2.
  • Figure 5: Real-world snapshots of the execution process of long-horizon tasks.
  • ...and 1 more figure
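
The candidate-selection step described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the cost used here (Euclidean distance from a hypothetical arm base to a hypothetical target position), the reach-based feasibility check, the preference for keeping both arms busy, and every name and coordinate are assumptions made for the example. The paper's actual cost function is not reproduced in this section.

```python
import math
from itertools import product

def candidates(priority, common):
    """An occupied arm (non-empty priority set) is restricted to its priority nodes;
    a free arm chooses from the common nodes or stays idle (None)."""
    return list(priority) if priority else list(common) + [None]

def select_pair(left, right, common, targets, reach=1.0):
    """Enumerate (left, right) node pairs, drop infeasible ones, return the cheapest.

    Assumed cost: summed arm-base-to-target distance, with fewer idle arms preferred.
    Assumed feasibility: both arms cannot take the same node, and each target
    must lie within `reach` of the arm assigned to it.
    """
    best_key, best_pair = None, None
    for l_node, r_node in product(candidates(left["priority"], common),
                                  candidates(right["priority"], common)):
        if l_node == r_node:
            continue  # same node for both arms, or both arms idle
        dists = [math.dist(arm["base"], targets[node])
                 for arm, node in ((left, l_node), (right, r_node))
                 if node is not None]
        if any(d > reach for d in dists):
            continue  # target out of reach for the assigned arm
        key = (2 - len(dists), sum(dists))  # (number of idle arms, total distance)
        if best_key is None or key < best_key:
            best_key, best_pair = key, (l_node, r_node)
    return best_pair

# Hypothetical scene loosely following the stages in Figure 2.
targets = {
    "grasp_apple": (0.4, 0.5),
    "open_drawer": (-0.4, 0.4),
    "place_apple": (-0.4, 0.4),
    "grasp_lemon": (-0.2, 0.5),
}
left = {"base": (-0.3, 0.0), "priority": []}
right = {"base": (0.3, 0.0), "priority": []}

# Stage 1: both arms are free, so both choose from the common candidates.
stage1 = select_pair(left, right, ["grasp_apple", "open_drawer"], targets)

# Stage 2: the right arm now holds the apple, so it only sees its priority node.
right_holding = {"base": (0.3, 0.0), "priority": ["place_apple"]}
stage2 = select_pair(left, right_holding, ["grasp_lemon"], targets)
```

With these made-up coordinates, stage 1 assigns the drawer to the left arm and the apple to the right arm because that pairing has the smaller total distance. In stage 2 the occupied right arm is limited to its priority node while the free left arm still picks from the common set.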