Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning
Nozomu Masuya, Hiroshi Sato, Koki Yamane, Takuya Kusume, Sho Sakaino, Toshiaki Tsuji
TL;DR
This work tackles data scarcity in force-controlled imitation learning for manipulation by introducing real-world data augmentation through teaching-playback at variable speeds. Using a four-channel bilateral control framework and a motion-copying system, the authors play back demonstrations at interpolated speeds and reuse only the successful trials to train a neural network in a bilateral control-based imitation learning setting. Experiments on pick-and-place and wiping tasks show up to a 55% increase in success rate, and closer adherence to the commanded duration/frequency, compared with a simple change in speed of the recorded real-world reactions. The method preserves real-world environmental interactions and offers a practical route to robust variable-speed, contact-rich manipulation, with potential to be combined with simulation-based augmentation and self-supervised learning for broader applicability.
Abstract
Because imitation learning relies on human demonstrations in hard-to-simulate settings, incorporating force control into it has led to a shortage of training data, even for a variation as simple as a change in speed. Although data augmentation has been used to address data shortages, conventional augmentation methods for robot manipulation are limited to simulation-based methods or downsampling for position control. This paper proposes a novel data augmentation method that is applicable to force control and preserves the advantages of real-world datasets. We apply teaching-playback at variable speeds as real-world data augmentation to increase both the quantity and quality of environmental reactions at variable speeds. We conducted experiments with bilateral control-based imitation learning, an imitation learning method equipped with position-force control. We evaluated the effect of real-world data augmentation on two tasks, pick-and-place and wiping, at variable speeds, each starting from two human demonstrations at a fixed speed. The results showed a maximum 55% increase in success rate over a simple change in speed of real-world reactions, and improved accuracy in following the duration/frequency command, achieved by gathering environmental reactions at variable speeds.
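
To make the idea of playback at variable speeds concrete, the following is a minimal sketch of how a recorded position trajectory could be time-scaled into a command reference for playback. It is an illustration under stated assumptions, not the authors' implementation: the function name `time_scale_trajectory`, the uniform sampling period, and the use of linear interpolation are all hypothetical choices. The sketch only produces the command; in the proposed method the command is played back on the real robot so that the environmental reactions at the new speed are measured rather than computed.

```python
import numpy as np


def time_scale_trajectory(t, q, speed):
    """Resample a recorded trajectory so that it plays back `speed` times faster.

    t     : (N,) sample times of the demonstration (assumed uniformly spaced)
    q     : (N, D) recorded positions, one column per joint (hypothetical layout)
    speed : playback speed ratio; 2.0 halves the duration, 0.5 doubles it
    """
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    if speed <= 0:
        raise ValueError("speed must be positive")

    dt = t[1] - t[0]                       # control period, assumed constant
    duration = (t[-1] - t[0]) / speed      # duration of the scaled playback
    n = int(round(duration / dt)) + 1      # number of samples at the same period
    t_new = t[0] + dt * np.arange(n)

    # Each playback instant maps back to a (faster or slower) instant
    # of the original demonstration.
    t_src = np.clip(t[0] + (t_new - t[0]) * speed, t[0], t[-1])

    # Linear interpolation per joint; an assumption made for simplicity.
    q_new = np.column_stack(
        [np.interp(t_src, t, q[:, j]) for j in range(q.shape[1])]
    )
    return t_new, q_new


# Usage: a 2 s demonstration sampled at 10 ms, replayed at twice the speed.
t = np.linspace(0.0, 2.0, 201)
q = np.column_stack([np.sin(t), t])
t_fast, q_fast = time_scale_trajectory(t, q, speed=2.0)
```

A purely offline rescaling like this corresponds to the "simple change in speed" that the abstract uses as its point of comparison: force and reaction signals depend on the speed of the motion, which is why the paper gathers them by real-world playback instead of resampling them.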
