IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning

Ryan Hoque, Ajay Mandlekar, Caelan Garrett, Ken Goldberg, Dieter Fox

TL;DR

IntervenGen (I-Gen) addresses distribution shift in robot imitation learning by autonomously generating a large, diverse set of corrective interventional data from a small number of human interventions. The framework combines closed-loop mistake generation with open-loop recovery replay to synthesize interventional trajectories, greatly expanding state-space coverage while reducing human labeling burden. Evaluations across four simulated tasks and one physical task show up to 39x robustness gains with only 10 interventions, and strong sim-to-real transfer capabilities, highlighting significant improvements in data efficiency and robustness for high-precision manipulation under perception noise.

Abstract

Imitation learning is a promising paradigm for training robot control policies, but these policies can suffer from distribution shift, where the conditions at evaluation time differ from those in the training data. A popular approach for increasing policy robustness to distribution shift is interactive imitation learning (e.g., DAgger and variants), where a human operator provides corrective interventions during policy rollouts. However, collecting a sufficient number of interventions to cover the distribution of policy mistakes can be burdensome for human operators. We propose IntervenGen (I-Gen), a novel data generation system that can autonomously produce a large set of corrective interventions with rich coverage of the state space from a small number of human interventions. We apply I-Gen to 4 simulated environments and 1 physical environment with object pose estimation error and show that it can increase policy robustness by up to 39x with only 10 human interventions. Videos and more results are available at https://sites.google.com/view/intervengen2024.

Paper Structure (15 sections, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: Overview. IntervenGen automatically generates corrective interventional data from a small number of human interventions, with coverage across both diverse scene configurations and policy mistake distributions. Here, the robot mistakenly believes the peg is at the position highlighted in red and requires demonstration of recovery behavior toward the true peg position.
  • Figure 2: I-Gen Data Generation Example. We provide an example of how I-Gen generates a new intervention. First, a new task instance is sampled with a new configuration (square peg location) and observation corruption (incorrect peg location highlighted in red). We execute the robot policy to generate mistake behavior for the new task instance. When a mistake is detected, we sample a human intervention segment from the source dataset and transform it to adapt to the current scene. Finally, we execute the transformed recovery segment in the environment.
  • Figure 3: Tasks. We evaluate I-Gen in several contact-rich, high-precision tasks. The top row shows normal task execution while the bottom row shows typical mistakes encountered by the agent when using inaccurate object poses (or object geometry for Nut-and-Peg Assembly) and associated recovery behaviors.
  • Figure 4: Sim-to-Real. We evaluate sim-to-real transfer for a block grasping task with a Franka Panda robot. Similar to Figure 3, we show normal task execution, typical mistakes due to inaccurate object poses, and associated recovery for the simulated and real-world environments. The results show that I-Gen can facilitate sim-to-real transfer of learned control policies, and that these policies retain robustness to erroneous perception.
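
The generation loop described in the Figure 2 caption can be illustrated with a toy 2D sketch. This is not the authors' implementation: the point-mass policy, the "reached the believed target" mistake detector, the translation-only segment transform, and all names (`sample_task`, `policy_step`, `transform_segment`, `generate_intervention`, `SOURCE`) are assumptions made for illustration. It only shows the order of the steps: sample a scene with a corrupted observation, roll out the policy closed-loop until a mistake occurs, adapt a human recovery segment to the new scene, and replay it open-loop.

```python
import random

# Hypothetical source intervention: a human recovery path recorded in a scene
# where the object sat at obj_pos. Waypoints end at the true object position.
SOURCE = {
    "obj_pos": (0.0, 0.0),
    "segment": [(0.03, 0.02), (0.01, 0.01), (0.0, 0.0)],
}

def sample_task(rng):
    """Sample a new scene: true object position plus a corrupted observation."""
    true_pos = (rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2))
    observed_pos = (true_pos[0] + rng.uniform(-0.05, 0.05),
                    true_pos[1] + rng.uniform(-0.05, 0.05))
    return true_pos, observed_pos

def policy_step(eef, believed_target, step=0.02):
    """Toy closed-loop policy: move the end effector toward the believed target."""
    dx, dy = believed_target[0] - eef[0], believed_target[1] - eef[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= step:
        return believed_target
    return (eef[0] + step * dx / dist, eef[1] + step * dy / dist)

def transform_segment(segment, src_obj, new_obj):
    """Shift source waypoints by the object displacement (translation only)."""
    ox, oy = new_obj[0] - src_obj[0], new_obj[1] - src_obj[1]
    return [(x + ox, y + oy) for x, y in segment]

def generate_intervention(source, rng, max_steps=200):
    true_pos, observed_pos = sample_task(rng)
    eef = (0.0, 0.5)
    mistake = [eef]
    # 1) Closed-loop mistake generation: the policy acts on the corrupted pose.
    for _ in range(max_steps):
        eef = policy_step(eef, observed_pos)
        mistake.append(eef)
        # Stand-in mistake detector: the policy arrived where it wrongly
        # believes the object to be.
        if eef == observed_pos:
            break
    # 2) Adapt the human recovery segment to the new true object position.
    recovery = transform_segment(source["segment"], source["obj_pos"], true_pos)
    # 3) Open-loop recovery replay: follow the transformed waypoints in order.
    for waypoint in recovery:
        eef = waypoint
    return {"true_pos": true_pos, "observed_pos": observed_pos,
            "mistake": mistake, "recovery": recovery, "final_eef": eef}

# One source intervention yields many synthetic ones across sampled scenes.
rng = random.Random(0)
dataset = [generate_intervention(SOURCE, rng) for _ in range(100)]
```

In this sketch, each generated sample pairs a distinct mistake trajectory with a recovery adapted to that scene, which is the sense in which a few human interventions can cover many scene configurations and mistake states. A real system would use full 6-DoF poses and a learned policy rather than this translation-only toy.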