Self-Augmented Robot Trajectory: Efficient Imitation Learning via Safe Self-augmentation with Demonstrator-annotated Precision
Hanbit Oh, Masaki Murooka, Tomohiro Motoda, Ryoichi Nakajo, Yukiyasu Domae
TL;DR
The paper addresses data efficiency and safety in robot imitation learning for clearance-limited manipulation by proposing Self-Augmented Robot Trajectory (SART), a framework that learns from a single human demonstration augmented by autonomous, collision-free trajectories within user-annotated precision spheres. SART comprises two stages: a one-shot teaching phase with annotated spheres around key waypoints and a self-augmentation phase where the robot samples poses on spherical surfaces and reconnects to the original demonstration to produce diverse training data. In simulation and real-world experiments on peg-in-hole, door opening, lid opening, toolbox picking, bottle placing, and lid closing tasks, SART achieves substantially higher success rates than single-demo replay, behavioral cloning, and contact-free MILES, while reducing human data collection effort. The approach offers a safer, more data-efficient pathway for imitation learning in manipulation, with potential for integration with larger visuomotor models and future extension to contact-aware augmentation strategies.
Abstract
Imitation learning is a promising paradigm for training robot agents; however, standard approaches typically require substantial data acquisition, via numerous demonstrations or random exploration, to ensure reliable performance. Although exploration reduces human effort, it lacks safety guarantees and frequently results in collisions, particularly in clearance-limited tasks (e.g., peg-in-hole), thereby necessitating manual environment resets and imposing additional human burden. This study proposes Self-Augmented Robot Trajectory (SART), a framework that enables policy learning from a single human demonstration while safely expanding the dataset through autonomous augmentation. SART consists of two stages: (1) one-time human teaching, in which a single demonstration is provided and precision boundaries, represented as spheres around key waypoints, are annotated, followed by one environment reset; (2) robot self-augmentation, in which the robot generates diverse, collision-free trajectories within these boundaries and reconnects to the original demonstration. This design improves data collection efficiency by minimizing human effort while ensuring safety. Extensive evaluations in simulation and real-world manipulation tasks show that SART achieves substantially higher success rates than policies trained solely on human-collected demonstrations. Video results are available at https://sites.google.com/view/sart-il .
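
The self-augmentation stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes position-only waypoints (no orientation), uniform sampling on the sphere surface, and straight-line reconnection to the demonstration, none of which the abstract specifies. It also assumes the annotated sphere is already free space, so no collision checker is included. The names `sample_on_sphere`, `lerp`, and `augment` are hypothetical.

```python
import math
import random


def sample_on_sphere(center, radius, rng):
    """Draw a point uniformly on the surface of a sphere (normalized Gaussian)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:  # reject the degenerate near-zero vector
            break
    return tuple(c + radius * x / norm for c, x in zip(center, v))


def lerp(a, b, n):
    """Return n points moving linearly from a (exclusive) to b (inclusive)."""
    return [tuple(x + (y - x) * k / n for x, y in zip(a, b))
            for k in range(1, n + 1)]


def augment(demo, radii, key_idx, rng, steps=5):
    """Build one augmented trajectory.

    Start from a sampled pose on the precision sphere around demo[key_idx],
    move back to that waypoint (assumed straight-line reconnection), then
    follow the remainder of the original demonstration.
    """
    start = sample_on_sphere(demo[key_idx], radii[key_idx], rng)
    return [start] + lerp(start, demo[key_idx], steps) + list(demo[key_idx + 1:])


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical demonstration: approach, pre-insertion, inserted (metres).
    demo = [(0.0, 0.0, 0.30), (0.0, 0.0, 0.10), (0.0, 0.0, 0.0)]
    # Hypothetical demonstrator-annotated radii: loose far away, tight near the hole.
    radii = {0: 0.05, 1: 0.01}
    dataset = [augment(demo, radii, i, rng) for i in radii for _ in range(10)]
    print(len(dataset), "augmented trajectories")
```

In this sketch the sphere radius acts as the precision annotation: a large radius around a free-space waypoint yields wide variation in starting poses, while a small radius near the clearance-limited region keeps the sampled poses close to the demonstrated path.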
