Select before Act: Spatially Decoupled Action Repetition for Continuous Control
Buqing Nie, Yangqing Fu, Yue Gao
TL;DR
SDAR is a spatially decoupled action repetition framework for continuous control in which a two-stage policy makes an act-or-repeat decision for each action dimension. By separating selection from action, SDAR learns more flexible repetition strategies that balance persistence with diversity, improving sample efficiency and reducing action fluctuation relative to existing closed-loop and open-loop repetition methods. The approach is validated on classic control, locomotion, and manipulation tasks, where it shows higher performance and smoother control while remaining computationally practical via selective sampling. This work advances temporally extended decision-making in RL by tailoring repetition to individual actuators, and leaves the incorporation of inter-dimensional correlations to future extensions.
Abstract
Reinforcement Learning (RL) has achieved remarkable success in various continuous control tasks, such as robot manipulation and locomotion. Unlike mainstream RL, which makes a decision at every individual step, recent studies have incorporated action repetition into RL, achieving greater action persistence along with improved sample efficiency and performance. However, existing methods treat all action dimensions as a whole during repetition, ignoring the variations among them. This constraint makes decisions inflexible, which reduces policy agility and effectiveness. In this work, we propose a novel repetition framework called SDAR, which implements Spatially Decoupled Action Repetition by performing closed-loop act-or-repeat selection for each action dimension individually. SDAR achieves more flexible repetition strategies, leading to an improved balance between action persistence and diversity. Compared to existing repetition frameworks, SDAR is more sample efficient, with higher policy performance and reduced action fluctuation. Experiments on various continuous control scenarios demonstrate the effectiveness of the spatially decoupled repetition design proposed in this work.
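To make the per-dimension act-or-repeat idea concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name `act_or_repeat`, the `repeat_probs` argument, and the use of fixed probabilities in place of a learned selection policy are all assumptions made for illustration. It shows only how a per-dimension binary selection could mix the previous action with a newly proposed one.

```python
import numpy as np

def act_or_repeat(prev_action, new_action, repeat_probs, rng):
    """Hypothetical per-dimension act-or-repeat step (illustrative only).

    prev_action:  action executed at the previous step, shape (d,)
    new_action:   freshly proposed action, shape (d,)
    repeat_probs: assumed per-dimension probability of repeating, shape (d,)
    rng:          a numpy.random.Generator
    """
    # Selection stage: one independent binary decision per action dimension.
    repeat_mask = rng.random(prev_action.shape) < repeat_probs
    # Action stage: repeated dimensions keep their previous value,
    # the remaining dimensions take the newly proposed value.
    action = np.where(repeat_mask, prev_action, new_action)
    return action, repeat_mask

rng = np.random.default_rng(0)
prev = np.array([0.1, -0.2, 0.3])
new = np.array([1.0, 2.0, 3.0])
# With probabilities of exactly 1 or 0 the outcome is deterministic:
# dimensions 0 and 2 repeat, dimension 1 acts.
action, mask = act_or_repeat(prev, new, np.array([1.0, 0.0, 1.0]), rng)
print(action, mask)
```

In contrast, a repetition scheme that treats the action as a whole would correspond to a single shared decision for all dimensions, so the mask here would be either all true or all false.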
