Learning Dual-arm Object Rearrangement for Cartesian Robots
Shishun Zhang, Qijin She, Wenhao Li, Chenyang Zhu, Yongjun Wang, Ruizhen Hu, Kai Xu
TL;DR
The paper addresses dual-arm object rearrangement for Cartesian robots, with the goal of minimizing makespan under realistic motion interference. It proposes an online reinforcement-learning framework that uses an attention-based neural network to learn object-to-arm assignments, pairing this higher-level policy with a collision-aware lower-level motion planner. Key contributions include (1) a scalable online RL method for task assignment, (2) an attention mechanism that captures dependencies between states to improve long-horizon decisions, and (3) a practical lower-level planner validated in simulation and on a real ROS-based system, with evidence of reduced makespan and computation time that grows linearly with the number of objects. The work is relevant to industrial robotics because it enables efficient, scalable dual-arm coordination in rearrangement tasks while accommodating motion constraints.
Abstract
This work focuses on the dual-arm object rearrangement problem abstracted from a realistic industrial scenario of Cartesian robots. The goal is to transfer all objects from their sources to their targets in the minimum total completion time. To achieve this, the core idea is to develop an effective object-to-arm task assignment strategy that minimizes the cumulative task execution time and maximizes the efficiency of dual-arm cooperation. One difficulty in task assignment is scalability: as the number of objects increases, the computation time of traditional offline-search-based methods grows rapidly because of their computational complexity. Encouraged by the adaptability of reinforcement learning (RL) to long-sequence decision tasks, we propose an online task assignment method based on RL whose computation time increases only linearly with the number of objects. Further, we design an attention-based network that models the dependencies between the input states throughout task execution, which helps find the most reasonable object-to-arm correspondence in each assignment round. In the experiments, we adapt several search-based methods to this specific setting and compare our method against them. The results show that our approach outperforms the search-based methods in both total execution time and computational efficiency, and that it generalizes to different numbers of objects. In addition, the supplementary video shows the effectiveness of our method when deployed on a real robot.
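To make the objective concrete, the sketch below computes the makespan (total completion time) of a simple online assignment rule for two arms. It is a toy illustration only, not the RL policy or the attention network described above. The 1-D layout, the `Task` type, the `task_time` cost model, and the greedy rule are all assumptions introduced here, and arm-to-arm interference is deliberately ignored even though the paper accounts for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    source: float  # pick position on a shared axis (hypothetical 1-D layout)
    target: float  # place position on the same axis


def task_time(arm_pos: float, task: Task, speed: float = 1.0) -> float:
    """Travel time: move to the source, then carry the object to the target."""
    return (abs(task.source - arm_pos) + abs(task.target - task.source)) / speed


def greedy_online_assign(tasks, arm_starts=(0.0, 10.0)):
    """Toy online rule: the arm that frees up first takes the task it can finish soonest.

    Returns (makespan, schedule), where schedule is a list of (arm_index, task).
    Assumption: arms never block each other, which is not true of the real problem.
    """
    positions = list(arm_starts)
    free_at = [0.0] * len(positions)  # time at which each arm becomes idle
    remaining = list(tasks)
    schedule = []
    while remaining:
        # One assignment round: pick the earliest-idle arm, then its cheapest task.
        arm = min(range(len(positions)), key=lambda i: free_at[i])
        task = min(remaining, key=lambda t: task_time(positions[arm], t))
        free_at[arm] += task_time(positions[arm], task)
        positions[arm] = task.target  # the arm ends at the drop-off position
        remaining.remove(task)
        schedule.append((arm, task))
    # Makespan is the time at which the last arm finishes its last task.
    return max(free_at), schedule


if __name__ == "__main__":
    demo = [Task(1.0, 2.0), Task(9.0, 8.0), Task(3.0, 4.0), Task(7.0, 6.0)]
    makespan, schedule = greedy_online_assign(demo)
    print(makespan, [arm for arm, _ in schedule])
```

A learned policy would replace the greedy `min` over remaining tasks with a scoring function over the current state, while the makespan computed at the end is the same quantity being minimized.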
