Learning and Deploying Robust Locomotion Policies with Minimal Dynamics Randomization
Luigi Campanaro, Siddhant Gangapurwala, Wolfgang Merkt, Ioannis Havoutis
TL;DR
The paper presents Extended Random Force Injection (ERFI) as a minimal, parameter-efficient alternative to dynamics randomization for training robust quadrupedal locomotion policies in simulation. ERFI combines per-step random perturbations of the joint torques (RFI) with an actuation offset that is resampled once per episode, in variants such as ERFI-C and ERFI-50. Together, the two terms capture both local and global dynamics variations, enabling sim-to-real transfer without extensive system identification. Empirical results show that ERFI-based policies outperform standard baselines and broad dynamics randomization, with on average about 53% better performance than RFI under mass variations and ~61% when a payload arm is added. The policies are validated in hardware on ANYmal C and Unitree A1 across flat and uneven terrain, which suggests ERFI is a lightweight alternative to actuator networks and heavy randomization for real-world legged locomotion.
Abstract
Training deep reinforcement learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behaviour. In this regard, simulators provide a cheap and abundant source of data. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI, referred to as extended random force injection (ERFI), by introducing an episodic actuation offset. We demonstrate that ERFI provides additional robustness to variations in system mass, offering on average 53% better performance than RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.
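The two ingredients described above, a random torque perturbation drawn at every control step and an actuation offset drawn once per episode, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `ERFIPerturbation`, the uniform sampling distribution, and the limits `rfi_limit` and `offset_limit` are assumptions made for illustration, and the numeric values in the usage lines are placeholders rather than settings from the paper.

```python
import numpy as np


class ERFIPerturbation:
    """Hypothetical helper: per-step random torques plus a per-episode offset."""

    def __init__(self, num_joints, rfi_limit, offset_limit, seed=None):
        self.num_joints = num_joints
        self.rfi_limit = rfi_limit        # assumed bound on the per-step perturbation
        self.offset_limit = offset_limit  # assumed bound on the episodic offset
        self.rng = np.random.default_rng(seed)
        self.offset = np.zeros(num_joints)

    def reset(self):
        """Sample a new actuation offset; call once at the start of each episode."""
        self.offset = self.rng.uniform(
            -self.offset_limit, self.offset_limit, self.num_joints
        )
        return self.offset

    def apply(self, torque):
        """Perturb commanded joint torques; call at every control step."""
        # Fresh noise at each step (the RFI term).
        noise = self.rng.uniform(-self.rfi_limit, self.rfi_limit, self.num_joints)
        # The offset stays fixed until the next reset (the episodic term).
        return np.asarray(torque, dtype=float) + noise + self.offset


# Usage with placeholder values: 12 actuated joints on a quadruped.
perturb = ERFIPerturbation(num_joints=12, rfi_limit=2.0, offset_limit=1.0, seed=0)
perturb.reset()
commanded = np.zeros(12)
applied = perturb.apply(commanded)
```

Setting `offset_limit` to zero in this sketch reduces it to plain RFI, which makes the difference between the two schemes easy to see: the noise term changes at every step, while the offset acts like a constant unmodelled bias for the whole episode.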
