An Efficient Multi-Robot Arm Coordination Strategy for Pick-and-Place Tasks using Reinforcement Learning
Tizian Jermann, Hendrik Kolvenbach, Fidel Esquivel Estay, Koen Kramer, Marco Hutter
TL;DR
The paper tackles multi-robot waste sorting on a conveyor belt by training a reinforcement-learning (RL) policy that allocates pick-and-place tasks among the robots. The problem is formulated as a custom OpenAI Gym environment and the policy is trained with Proximal Policy Optimization. The RL approach is compared to a combinatorial game theory baseline across diverse pattern distributions, achieving up to 16% higher picking rates and generalizing robustly to unseen scenarios. The work also validates the approach on a two-robot hardware setup, illustrating sim-to-real transfer and practical throughput improvements, and discusses the implications of belt speed and scalability to more agents. Overall, the study shows that RL can flexibly and effectively optimize multi-robot coordination for high-throughput pick-and-place tasks in industrial-style sorting systems.
Abstract
We introduce a novel strategy for multi-robot sorting of waste objects using Reinforcement Learning. Our focus lies on finding optimal picking strategies that enable effective coordination of a multi-robot system, with the goal of maximizing the waste removal potential. We realize this by formulating the sorting problem as an OpenAI Gym environment and training a neural network with a deep reinforcement learning algorithm. The objective function is set up to optimize the picking rate of the robotic system. In simulation, we compare the performance against an intuitive combinatorial game theory-based approach. We show that the trained policies outperform this baseline and achieve up to 16% higher picking rates. Finally, the respective algorithms are validated on a hardware setup consisting of a two-robot sorting station able to process incoming waste objects through pick-and-place operations.
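To make the environment formulation more concrete, the following is a minimal sketch of what a Gym-style sorting environment could look like. It is not the authors' environment: the class name `ConveyorSortingEnv`, the 1-D belt model, the fixed per-robot reach windows, and the spawn probability are all hypothetical simplifications chosen for illustration. The sketch follows the classic Gym `reset`/`step` convention without importing the library, and uses a simple greedy rule as a stand-in policy (neither the trained PPO policy nor the game theory baseline from the paper).

```python
import random


class ConveyorSortingEnv:
    """Toy environment with a Gym-style reset/step interface.

    Hypothetical simplification: the belt is a 1-D list of cells (1 = object),
    each robot reaches a fixed window of cells, and a pick takes one time step.
    """

    def __init__(self, belt_length=8, windows=((2, 4), (5, 7)),
                 spawn_prob=0.5, horizon=50, seed=0):
        self.belt_length = belt_length
        self.windows = windows        # inclusive (first, last) cell per robot
        self.spawn_prob = spawn_prob  # chance that a new object enters per step
        self.horizon = horizon        # episode length in time steps
        self.rng = random.Random(seed)

    def reset(self):
        self.belt = [0] * self.belt_length
        self.t = 0
        return tuple(self.belt)

    def step(self, action):
        """action: one entry per robot, a cell index to pick or None to idle."""
        picked = 0
        for (lo, hi), cell in zip(self.windows, action):
            # A pick only succeeds inside the robot's window on an occupied cell.
            if cell is not None and lo <= cell <= hi and self.belt[cell] == 1:
                self.belt[cell] = 0
                picked += 1
        # The belt advances one cell; an object in the last cell leaves unsorted.
        new_cell = 1 if self.rng.random() < self.spawn_prob else 0
        self.belt = [new_cell] + self.belt[:-1]
        self.t += 1
        done = self.t >= self.horizon
        return tuple(self.belt), float(picked), done, {}


def greedy_policy(obs, windows):
    """Stand-in policy: each robot picks the object closest to leaving its window."""
    action = []
    for lo, hi in windows:
        candidates = [c for c in range(lo, hi + 1) if obs[c] == 1]
        action.append(max(candidates) if candidates else None)
    return tuple(action)


def run_episode(env, policy):
    """Roll out one episode and return the picking rate (picks per time step)."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs, env.windows))
        total += reward
    return total / env.horizon
```

Calling `run_episode(ConveyorSortingEnv(), greedy_policy)` returns the average number of picks per step, which plays the role of the picking rate in this toy setting. Because the reward is the number of successful picks, an RL algorithm maximizing the return on such an environment would be optimizing the same quantity.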
