PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning
Tian Gao, Soroush Nasiriany, Huihan Liu, Quantao Yang, Yuke Zhu
TL;DR
PRIME addresses the data inefficiency of imitation learning in long-horizon manipulation by scaffolding tasks with a fixed set of behavior primitives and learning a high-level policy that sequences them. A self-supervised data collection procedure trains an inverse dynamics model (IDM) that maps state pairs to primitives, and a trajectory parser uses dynamic programming to convert demonstrations into primitive sequences without segmentation labels. The policy is trained by imitation on the parsed sequences, aided by suffix-based data augmentation and pretraining on the IDM data. PRIME reports 10-34% higher success rates than state-of-the-art baselines in simulation and 20-48% on physical hardware, and it shows generalization and recovery capabilities, although the sim-to-real gap remains a challenge. Overall, PRIME offers a data-efficient framework for primitive-based imitation in tabletop manipulation, with expanding the primitive library and applying curriculum learning as future directions.
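The dynamic-programming parsing step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the IDM is exposed as a hypothetical `score(i, j)` callable that returns the most likely primitive for moving from state `i` to state `j` together with its log-probability, and it assumes the parser maximizes the summed log-probability over contiguous segments of at most `max_len` steps. The names `parse_trajectory`, `score`, `toy_score`, and `max_len` are illustrative only.

```python
import math
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start index, end index, primitive name)


def parse_trajectory(
    num_states: int,
    score: Callable[[int, int], Tuple[str, float]],
    max_len: int,
) -> List[Segment]:
    """Split states 0..num_states-1 into contiguous segments that
    maximize the total log-probability assigned by `score`."""
    # best[j]: best total log-probability of any parse ending at state j.
    best = [-math.inf] * num_states
    # back[j]: (start index, primitive) of the last segment in that parse.
    back: List[Optional[Tuple[int, str]]] = [None] * num_states
    best[0] = 0.0
    for j in range(1, num_states):
        for i in range(max(0, j - max_len), j):
            if best[i] == -math.inf:
                continue  # state i is not reachable by any parse
            primitive, log_prob = score(i, j)
            candidate = best[i] + log_prob
            if candidate > best[j]:
                best[j] = candidate
                back[j] = (i, primitive)
    # Walk the back-pointers from the final state to recover the segments.
    segments: List[Segment] = []
    j = num_states - 1
    while j > 0:
        if back[j] is None:
            raise ValueError("no valid parse found for this trajectory")
        i, primitive = back[j]
        segments.append((i, j, primitive))
        j = i
    segments.reverse()
    return segments


def toy_score(i: int, j: int) -> Tuple[str, float]:
    """Hypothetical stand-in for an IDM: confident on two true segments."""
    table = {(0, 3): "reach", (3, 6): "grasp"}
    if (i, j) in table:
        return table[(i, j)], math.log(0.9)
    return "other", math.log(0.1)


segments = parse_trajectory(7, toy_score, max_len=4)
print(segments)  # [(0, 3, 'reach'), (3, 6, 'grasp')]
```

In this toy setup the parser recovers the two high-confidence segments because every other segmentation includes at least one low-probability segment. A real IDM would also predict primitive parameters, which this sketch omits.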
Abstract
Imitation learning has shown great potential for enabling robots to acquire complex manipulation behaviors. However, these algorithms suffer from high sample complexity in long-horizon tasks, where errors compound over the task horizon. We present PRIME (PRimitive-based IMitation with data Efficiency), a behavior primitive-based framework designed to improve the data efficiency of imitation learning. PRIME scaffolds robot tasks by decomposing task demonstrations into primitive sequences and then learning, through imitation, a high-level control policy that sequences the primitives. Our experiments demonstrate that PRIME achieves a significant performance improvement in multi-stage manipulation tasks, with 10-34% higher success rates in simulation over state-of-the-art baselines and 20-48% on physical hardware.
