Collective Intelligence for 2D Push Manipulations with Mobile Robots

So Kuroki; Tatsuya Matsushima; Jumpei Arima; Hiroki Furuta; Yutaka Matsuo; Shixiang Shane Gu; Yujin Tang

TL;DR

This work shows that distilling a planner derived from a differentiable soft-body physics simulator into an attention-based neural network yields a multi-robot push manipulation system that outperforms baselines.

Abstract

While natural systems often exhibit collective intelligence that allows them to self-organize and adapt to changes, the equivalent is missing in most artificial systems. We explore the possibility of such a system in the context of cooperative 2D push manipulations using mobile robots. Although prior work demonstrates potential solutions for the problem in restricted settings, these approaches suffer from computational and learning difficulties. More importantly, they cannot adapt when the environment changes. In this work, we show that by distilling a planner derived from a differentiable soft-body physics simulator into an attention-based neural network, our multi-robot push manipulation system achieves better performance than baselines. In addition, our system generalizes to configurations not seen during training and adapts toward task completion when external turbulence and environmental changes are applied. Supplementary videos can be found on our project website: https://sites.google.com/view/ciom/home
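To make the idea of a planner "derived from a differentiable simulator" concrete, the following is a minimal sketch of gradient-based planning, not the paper's planner. It assumes a toy dynamics model in which the object simply moves by each action (`s_{t+1} = s_t + a_t`), standing in for the soft-body simulator; the names `rollout` and `plan`, the horizon, and the learning rate are all hypothetical. The gradient is written out by hand here, whereas a differentiable simulator would supply it automatically.

```python
import numpy as np

def rollout(s0, actions):
    """Toy differentiable dynamics (an assumption): s_{t+1} = s_t + a_t."""
    return s0 + actions.sum(axis=0)

def plan(s0, s_goal, horizon=10, lr=0.01, iters=200):
    """Gradient descent on an action sequence to minimise ||s_T - s_goal||^2."""
    actions = np.zeros((horizon, s0.shape[0]))
    for _ in range(iters):
        s_T = rollout(s0, actions)
        # For this toy model, d/da_t ||s_T - s_goal||^2 = 2 (s_T - s_goal) for every t.
        grad = 2.0 * (s_T - s_goal)
        actions -= lr * grad  # the same gradient is broadcast to every time step
    return actions

s0 = np.array([0.0, 0.0])        # current object state (hypothetical 2D position)
s_goal = np.array([1.0, -0.5])   # goal object state
planned_actions = plan(s0, s_goal)
final_state = rollout(s0, planned_actions)
```

In a distillation setup, pairs of observations and `planned_actions` collected from many such planning runs would serve as supervised training data for the policy network.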

Paper Structure (17 sections, 3 equations, 7 figures, 4 tables)

Introduction
Related Work
Multi-Robot Push Manipulations
Collective Intelligence in Machine Learning and Robotics
Differentiable Physics Engines
Multi-Head Self-Attention
Proposed Method
Problem Statement
Gradient-based Motion Planning
Distilling the Planner
Real-world Policy Deployment
Experiments
Experimental Setup
Evaluating the Gradient-based Motion Planner
Evaluating the Distilled Policy
...and 2 more sections

Figures (7)

Figure 1: Our robots trained in simulation (top left) generalize to real-world tests (top right). They even learned to adapt when one of the robots was pulled away during task execution (bottom left). In this case, a neighboring robot quickly compensated for the missing robot and worked toward completing the task (bottom right). In all figures, the blue lines describe the goal poses for the rope. In the bottom row, we highlight the robots involved in the turbulence and the adaptation, and superimpose the state from the previous time step translucently to illustrate the changes.
Figure 2: An overview of our method. (a) In this phase, we focus on designing the data interfaces and evaluating a derived gradient-based motion planner. $s_t$ and $s_g$ represent the current and the goal states of the target object; a-d are four input components considered by each robot (see the Distilling the Planner section for details). (b) We collect data from the planner and distill the behavior into an attention-based neural network policy. The data dimension $d_{in}=d_a+d_b+d_c+d_d$ is an aggregation of the four input parts described in the Distilling the Planner section. (c) We zero-shot transfer the distilled policy onto real robots and conduct tests, including settings that were not seen during training.
Figure 3: Illustration of our real-world experimental setup. We define the arena area and acquire the positions of the robots via the ArUco markers on the floor and on the Roombas.
Figure 4: Comparison of $r(s_T)$ in rope manipulation. For comparison, we fixed the number of robots at $N_r=6$. (a) The mean reward and standard deviation for each episode over 5 seeds on the same task. (b) The mean reward and standard deviation for each episode over 5 tasks with the same seed.
Figure 5: Applying the gradient-based motion planner (GMP) directly in the real world. On the left, the robot pushes a box to a specified pose configuration (the blue bounded area). On the right, two robots twist a rope into a required shape. Following PlasticineLab (Huang et al., 2021), we employ a fixed environmental object as an anchor in the middle.
...and 2 more figures
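
The Figure 2 caption describes per-robot inputs of dimension $d_{in}=d_a+d_b+d_c+d_d$ fed to an attention-based policy. The sketch below illustrates how self-attention can map such inputs to per-robot actions. It is a simplified single-head layer with random weights, not the paper's multi-head network; the component sizes, `d_model`, `d_act`, and the function name `attention_policy` are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_policy(x, W_q, W_k, W_v, W_out):
    """Single-head self-attention: x (N_r, d_in) -> per-robot actions (N_r, d_act)."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # (N_r, N_r) robot-to-robot weights
    return (weights @ v) @ W_out

rng = np.random.default_rng(0)
d_a, d_b, d_c, d_d = 4, 4, 2, 2      # hypothetical sizes of the four input components
d_in = d_a + d_b + d_c + d_d
d_model, d_act = 16, 2               # hypothetical hidden size and action size
shapes = [(d_in, d_model)] * 3 + [(d_model, d_act)]
params = [rng.normal(scale=0.1, size=s) for s in shapes]

x6 = rng.normal(size=(6, d_in))                 # inputs for N_r = 6 robots
actions6 = attention_policy(x6, *params)        # shape (6, d_act)
actions5 = attention_policy(x6[:5], *params)    # the same weights handle 5 robots
```

Because the weights are shared across robots, the same parameters accept any number of robots, and reordering the robots reorders the outputs in the same way. These are general properties of self-attention; how the paper's network makes use of them is not stated on this page.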
