JUICER: Data-Efficient Imitation Learning for Robotic Assembly

Lars Ankile, Anthony Simeonov, Idan Shenfeld, Pulkit Agrawal

TL;DR

We propose a pipeline that improves imitation learning performance under a small human demonstration budget. It enables a manipulator to assemble up to five parts over nearly 2500 time steps directly from RGB images and outperforms imitation and data augmentation baselines.

Abstract

While learning from demonstrations is powerful for acquiring visuomotor policies, high-performance imitation without large demonstration datasets remains challenging for tasks requiring precise, long-horizon manipulation. This paper proposes a pipeline for improving imitation learning performance with a small human demonstration budget. We apply our approach to assembly tasks that require precisely grasping, reorienting, and inserting multiple parts over long horizons and multiple task phases. Our pipeline combines expressive policy architectures and various techniques for dataset expansion and simulation-based data augmentation. These help expand dataset support and supervise the model with locally corrective actions near bottleneck regions requiring high precision. We demonstrate our pipeline on four furniture assembly tasks in simulation, enabling a manipulator to assemble up to five parts over nearly 2500 time steps directly from RGB images, outperforming imitation and data augmentation baselines. Project website: https://imitation-juicer.github.io/.

Paper Structure

This paper contains 58 sections, 5 equations, 9 figures, 8 tables, 1 algorithm.

Figures (9)

  • Figure 1: Overview of the proposed approach. (1) Collect a small number of demonstrations for the task (and related tasks, if available). (2) Using annotations of bottleneck states, augment the demonstration trajectories to create synthetic corrective actions and increase coverage around the bottleneck states. (3) Use this dataset to train models with different hyperparameters. (4) Store all rollouts throughout model evaluations. (5) Add any successful rollout to the training set and train the best architecture on all data amassed.
  • Figure 2: An example of extracting "bottleneck" states and using a trajectory augmentation tool to create an arbitrary number of counterfactual trajectories near the bottleneck, effectively increasing the data support and teaching corrective actions.
  • Figure 3: Overview of the tasks. The first row shows a random initialization of the parts, and the second row shows the full assembly.
  • Figure 4: Average and maximum success rates (%) of methods across tasks. Bolded methods are components of our JUICER pipeline.
  • Figure 5: Learning one_leg from 10 human demos.
  • ...and 4 more figures
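
The augmentation idea described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a state is just a 3D position, replaces the simulator with straight-line motion, and uses hypothetical names (`augment_bottleneck`, `max_offset`, `n_steps`). It samples perturbed start states around an annotated bottleneck state and labels each step with the action that moves back toward the bottleneck, which is the sense in which the synthetic data is "corrective".

```python
import random


def augment_bottleneck(bottleneck, n_trajectories=5, n_steps=4,
                       max_offset=0.02, seed=0):
    """Return synthetic corrective trajectories that end at `bottleneck`.

    Hypothetical helper for illustration only. Each trajectory is a list of
    (state, action) pairs, where the action is the position delta that moves
    the state one step closer to the bottleneck.
    """
    rng = random.Random(seed)
    trajectories = []
    for _ in range(n_trajectories):
        # Sample a perturbed start state in a small box around the bottleneck.
        start = tuple(b + rng.uniform(-max_offset, max_offset)
                      for b in bottleneck)
        # Constant per-step delta that returns linearly to the bottleneck.
        step = tuple((b - s) / n_steps for b, s in zip(bottleneck, start))

        state = start
        trajectory = []
        for _ in range(n_steps):
            trajectory.append((state, step))
            state = tuple(s + d for s, d in zip(state, step))
        trajectories.append(trajectory)
    return trajectories


if __name__ == "__main__":
    # Assumed bottleneck position (metres), e.g. just before an insertion.
    demos = augment_bottleneck((0.5, 0.0, 0.1), n_trajectories=2)
    for traj in demos:
        print(len(traj), "steps, starting at", traj[0][0])
```

Because the number of sampled trajectories is a free parameter, one annotated bottleneck state can yield as many corrective examples as needed. A faithful version would perturb the full robot and part state inside the simulator rather than a single position.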