Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer

Coline Devin; Abhishek Gupta; Trevor Darrell; Pieter Abbeel; Sergey Levine

TL;DR

The paper tackles the data inefficiency of deep RL for robotic skills by proposing modular policy networks that split the policy representation into task-specific and robot-specific modules. Recombining pretrained modules enables transfer across robots and across tasks, including zero-shot generalization to robot-task combinations not seen during training. Trained with guided policy search (GPS) across multiple simulated robots and tasks, the approach demonstrates zero-shot transfer on visual and non-visual manipulation tasks and accelerates learning for held-out combinations. The work highlights regularization to enforce invariance at the interface between modules and discusses future directions toward lifelong, scalable multi-robot transfer with larger repertoires of robots and tasks.

Abstract

Reinforcement learning (RL) can automate a wide variety of robotic skills, but learning each new skill requires considerable real-world data collection and manual representation engineering to design policy classes or features. Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations. Transfer learning can mitigate this problem by enabling us to transfer information from one skill to another and even from one robot to another. We show that neural network policies can be decomposed into "task-specific" and "robot-specific" modules, where the task-specific modules are shared across robots, and the robot-specific modules are shared across all tasks on that robot. This allows for sharing task information, such as perception, between robots and sharing robot information, such as dynamics and kinematics, between tasks. We exploit this decomposition to train mix-and-match modules that can solve new robot-task combinations that were not seen during training. Using a novel neural network architecture, we demonstrate the effectiveness of our transfer method for enabling zero-shot generalization with a variety of robots and tasks in simulation for both visual and non-visual tasks.
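The mix-and-match decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the module sizes, the robot and task names (`arm3`, `arm4`, `reach`, `push`), the interface width, and the choice to feed the robot observation into the robot module alongside the task module's output are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim, hidden_dim, out_dim):
    """Randomly initialised two-layer MLP weights (illustrative, untrained)."""
    return {
        "W1": rng.normal(scale=0.1, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(scale=0.1, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    """Forward pass: linear -> ReLU -> linear."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]

# Hypothetical dimensions; none of these values come from the paper.
INTERFACE_DIM = 4                        # width of the task -> robot interface
TASK_OBS_DIM = {"reach": 3, "push": 6}   # task-specific observation sizes
ROBOT_OBS_DIM = {"arm3": 6, "arm4": 8}   # robot-specific observation sizes
ROBOT_ACT_DIM = {"arm3": 3, "arm4": 4}   # one action per joint

# One module per task (shared across robots) and one per robot (shared across tasks).
task_modules = {t: make_mlp(d, 16, INTERFACE_DIM) for t, d in TASK_OBS_DIM.items()}
robot_modules = {
    r: make_mlp(INTERFACE_DIM + ROBOT_OBS_DIM[r], 16, ROBOT_ACT_DIM[r])
    for r in ROBOT_OBS_DIM
}

def policy(task, robot, task_obs, robot_obs):
    """Compose a policy for any (task, robot) pair from the two module banks."""
    interface = mlp(task_modules[task], task_obs)
    return mlp(robot_modules[robot], np.concatenate([interface, robot_obs]))

# Any pairing can be assembled, including one that was held out of training.
action = policy("push", "arm4", np.zeros(6), np.zeros(8))
```

Because every task module emits a vector of the same interface width, any task module can be paired with any robot module. In a trained system, that shared interface is what would allow a combination never seen during training to be assembled from existing modules.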
