Supervision via Competition: Robot Adversaries for Learning Tasks

Lerrel Pinto, James Davidson, Abhinav Gupta

TL;DR

The paper introduces an adversarial self-supervised learning framework for robotics, in which a grasping protagonist competes against an antagonist (one that shakes or snatches the object) to create harder, more informative training signals. By jointly optimizing a grasping policy and an adversarial policy, the approach yields significantly more robust grasps on novel objects and demonstrates improved data efficiency compared to a data-heavy baseline. The method uses an AlexNet-like ConvNet with patch-based grasp prediction and two discrete adversaries, validated on a Baxter robot with both shaking and snatching disruptions. The results show substantial gains in grasp success rates (82% of novel objects grasped with adversarial training, compared to 68% without) and suggest that adversarial supervision can outperform simple data augmentation or multi-robot collaboration in self-supervised learning.

Abstract

There has been a recent paradigm shift in robotics to data-driven learning for planning and control. Due to the large number of experiences required for training, most of these approaches use a self-supervised paradigm: using sensors to measure success/failure. However, in most cases, these sensors provide weak supervision at best. In this work, we propose an adversarial learning framework that pits an adversary against the robot learning the task. In an effort to defeat the adversary, the original robot learns to perform the task with more robustness, leading to overall improved performance. We show that this adversarial framework forces the robot to learn a better grasping model in order to overcome the adversary. By grasping 82% of presented novel objects compared to 68% without an adversary, we demonstrate the utility of creating adversaries. We also demonstrate via experiments that having robots in an adversarial setting might be a better learning strategy than having multiple collaborative robots.

Paper Structure

This paper contains 19 sections, 5 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: We propose an adversarial framework for effective self-supervised learning. In our framework, the protagonist attempts to learn a policy for a task such as grasping, while an adversary learns to make the protagonist fail at that task. For example, in the figure above, the adversary tries to snatch the object from the protagonist. Both policies are learned simultaneously, leading to robust learning by the protagonist.
  • Figure 2: (first row) Examples of successful yet unstable grasps. (second row) Examples of stable and successful grasps.
  • Figure 3: Given a weak grasp, an adversary can destabilize it in multiple ways. Left shows the motion of a linear shake on the same arm that could destabilize this grasp. Another way to destabilize it is a push grasp on the object by a different arm, followed by a pull. Center shows how snatching/pulling can destabilize the grasp, while right shows how the pushing motion can destabilize the grasp.
  • Figure 4: The shake space contains 15 discrete actions, with 3 directions of linear shake for 5 different configurations. These 5 configurations are seen as individual images with the 3 directions of shake shown by white arrows.
  • Figure 5: The network used by both the protagonist and the antagonist is modeled after AlexNet (Krizhevsky et al., 2012). The output of this network is scaled using a sigmoidal function.
  • ...and 6 more figures
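
The captions for Figures 4 and 5 describe a discrete shake space (5 configurations, each with 3 shake directions, 15 actions in total) and a network output scaled by a sigmoid. The following is a minimal sketch of how such an action space could be enumerated and scored. The configuration and direction labels, the `pick_shake` helper, and the choice of taking the highest-scoring action are all assumptions made for illustration; they are not taken from the paper.

```python
import itertools
import math

# Hypothetical labels: the caption only states 5 configurations x 3 directions.
CONFIGURATIONS = [f"config_{i}" for i in range(5)]
DIRECTIONS = [f"direction_{j}" for j in range(3)]

# Every (configuration, direction) pair is one discrete shake action.
SHAKE_ACTIONS = list(itertools.product(CONFIGURATIONS, DIRECTIONS))


def sigmoid(x):
    """Scale a raw score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pick_shake(raw_scores):
    """Return the action with the highest sigmoid-scaled score.

    Assumption: one raw network output per shake action; the selection
    rule (argmax) is illustrative, not the paper's stated procedure.
    """
    if len(raw_scores) != len(SHAKE_ACTIONS):
        raise ValueError("expected one score per shake action")
    scaled = [sigmoid(s) for s in raw_scores]
    best = max(range(len(scaled)), key=scaled.__getitem__)
    return SHAKE_ACTIONS[best], scaled[best]
```

Calling `pick_shake` with a list of 15 raw scores returns the selected `(configuration, direction)` pair together with its scaled score, which can be read as a predicted probability that the shake dislodges the object.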