Training Directional Locomotion for Quadrupedal Low-Cost Robotic Systems via Deep Reinforcement Learning
Peter Böhm, Archie C. Chapman, Pauline Pounds
TL;DR
This work applies deep reinforcement learning directly in the real world to achieve directional locomotion on a low-cost quadrupedal robot, randomizing the heading at episode resets to promote diverse action–state exploration. The authors introduce a GRU-based sequence encoder (R-TD3) and compare three heading-reset strategies, showing that normally distributed random resets enable robust performance on complex trajectories such as figure eights. Training directly on commodity hardware reduces the sim-to-real gap and shows that high-end simulators are not strictly necessary for learning versatile locomotion policies. The resulting policies follow straight lines, circles, and more intricate trajectories with minimal human intervention, which highlights the practical benefits for scalable, inexpensive legged robots.
Abstract
In this work, we present Deep Reinforcement Learning (DRL) training of directional locomotion for low-cost quadrupedal robots in the real world. In particular, we exploit randomization of the heading that the robot must follow to foster exploration of the action-state transitions most useful for learning both forward locomotion and course adjustments. Changing the heading at episode resets to the current yaw plus a random value drawn from a normal distribution yields policies able to follow complex trajectories involving frequent turns in both directions as well as long straight-line stretches. By repeatedly changing the heading, this method keeps the robot moving within the training platform and thus reduces human involvement and the need for manual resets during training. Real-world experiments on a custom-built, low-cost quadruped demonstrate the efficacy of our method, with the robot successfully navigating all validation tests. When trained with other approaches, the robot succeeds only in the forward locomotion test and fails when turning is required.
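
The heading reset described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the standard deviation `sigma`, the use of radians, and the wrapping of angles to [-pi, pi] are all assumptions made for illustration only.

```python
import math
import random


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def reset_heading(current_yaw, sigma, rng=random):
    """Return a new target heading at an episode reset.

    The target is the robot's current yaw plus zero-mean Gaussian noise.
    `sigma` (radians) is a hypothetical tuning parameter; the abstract does
    not state its value.
    """
    return wrap_angle(current_yaw + rng.gauss(0.0, sigma))


def heading_error(target_heading, current_yaw):
    """Signed shortest angular difference from the current yaw to the target."""
    return wrap_angle(target_heading - current_yaw)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    yaw = 0.0
    for episode in range(3):
        target = reset_heading(yaw, sigma=0.5, rng=rng)
        print(episode, round(target, 3), round(heading_error(target, yaw), 3))
        yaw = target  # pretend the robot ended the episode on the target heading
```

Because the noise is centered on the current yaw, small draws leave the robot walking nearly straight while larger draws of either sign require a left or right turn, which matches the abstract's claim that this reset exposes the policy to both forward locomotion and course adjustments.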
