SpeedAug: Policy Acceleration via Tempo-Enriched Policy and RL Fine-Tuning
Taewook Nam, Sung Ju Hwang
TL;DR
The paper tackles the slow execution of pre-trained robotic policies, whose speed is limited by the cost of collecting faster demonstrations. It introduces SpeedAug, a two-stage framework that first pre-trains a diffusion-based policy on speed-augmented demonstrations to encode diverse tempos, then fine-tunes it with RL (DPPO) to converge on safe, fast execution. The key contributions are the tempo-enriched pre-training of a multimodal diffusion policy and an RL fine-tuning strategy that substantially improves sample efficiency while preserving high task success. Empirical results on Robosuite and Kitchen tasks show significant speedups with fewer online samples, highlighting practical benefits for accelerated robotic manipulation.
Abstract
Recent advances in robotic policy learning have enabled complex manipulation in real-world environments, yet the execution speed of these policies often lags behind hardware capabilities due to the cost of collecting faster demonstrations. Existing works on policy acceleration reinterpret action sequences for unseen execution speeds, thereby encountering distributional shifts from the original demonstrations. Reinforcement learning is a promising approach that adapts policies for faster execution without additional demonstrations, but its unguided exploration is sample-inefficient. We propose SpeedAug, an RL-based policy acceleration framework that efficiently adapts pre-trained policies for faster task execution. SpeedAug constructs a behavior prior that encompasses diverse tempos of task execution by pre-training a policy on speed-augmented demonstrations. Empirical results on robotic manipulation benchmarks show that RL fine-tuning initialized from this tempo-enriched policy significantly improves the sample efficiency of existing RL and policy acceleration methods while maintaining a high success rate.
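To make the idea of speed-augmented demonstrations concrete, the following is a minimal sketch of one way a demonstration could be retimed to several tempos. It is not the authors' implementation: it assumes actions are absolute position targets stored as a `(T, D)` array, it uses plain linear interpolation, and the names `speed_augment` and `build_augmented_dataset` as well as the speed factors are hypothetical.

```python
import numpy as np


def speed_augment(positions, speed):
    """Retime a (T, D) trajectory of absolute position targets.

    A `speed` of 2.0 yields a trajectory with roughly half as many steps
    that follows the same path. Linear interpolation is an assumption
    made for this sketch.
    """
    positions = np.asarray(positions, dtype=float)
    if positions.ndim != 2:
        raise ValueError("positions must have shape (T, D)")
    if speed <= 0:
        raise ValueError("speed must be positive")

    n = positions.shape[0]
    # Length of the retimed trajectory; keep at least the two endpoints.
    m = max(2, int(round((n - 1) / speed)) + 1)
    # Fractional indices into the original trajectory, endpoints included.
    src = np.linspace(0.0, n - 1, m)
    original_idx = np.arange(n)
    # Interpolate every action dimension independently.
    columns = [
        np.interp(src, original_idx, positions[:, d])
        for d in range(positions.shape[1])
    ]
    return np.stack(columns, axis=1)


def build_augmented_dataset(demos, speeds=(1.0, 2.0, 3.0)):
    """Pair every demonstration with every speed factor (hypothetical helper)."""
    return [(s, speed_augment(d, s)) for d in demos for s in speeds]


if __name__ == "__main__":
    # Toy 2-D demonstration with 11 steps.
    demo = np.stack(
        [np.linspace(0.0, 1.0, 11), np.linspace(2.0, 4.0, 11)], axis=1
    )
    for s, traj in build_augmented_dataset([demo]):
        print(f"speed {s}: {traj.shape[0]} steps")
```

A policy pre-trained on such a mixture would see the same task executed at several tempos, which is the kind of behavior prior the abstract describes. How the observations are retimed alongside the actions, and which speed factors are used, are details this sketch leaves open.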
