Variable Stiffness for Robust Locomotion through Reinforcement Learning
Dario Spoljaric, Yashuai Yan, Dongheui Lee
TL;DR
The paper tackles the challenge of manual joint-stiffness tuning in RL-based quadruped locomotion by adding variable stiffness directly to the action space under four grouping strategies: individual joint stiffness (IJS), per-joint stiffness (PJS), per-leg stiffness (PLS), and hybrid joint-leg stiffness (HJLS). Using PPO with extensive domain randomization, the authors train policies that predict both target joint positions and stiffness, with damping derived from stiffness via $K_t^d = 0.2 \sqrt{K_t^p}$. Empirically, PLS achieves the best velocity-tracking and push-recovery performance, while HJLS is the most energy-efficient; the ungrouped IJS policy does not outperform the grouped approaches. Policies trained only on flat ground transfer robustly from simulation to the real robot, walking reliably outdoors and adapting to payloads, which simplifies design by removing the need for manual stiffness tuning across tasks and terrains.
Abstract
Reinforcement-learned locomotion enables legged robots to perform highly dynamic motions but is often accompanied by time-consuming manual tuning of joint stiffness. This paper introduces a novel control paradigm that integrates variable stiffness into the action space alongside joint positions, enabling grouped stiffness control such as per-joint stiffness (PJS), per-leg stiffness (PLS) and hybrid joint-leg stiffness (HJLS). We show that variable stiffness policies grouped per leg (PLS) outperform position-based control in velocity tracking and push recovery, while HJLS excels in energy efficiency. Although our policy is trained only on a flat floor, our method exhibits robust walking behaviour on diverse outdoor terrains, indicating robust sim-to-real transfer. Our approach simplifies design by eliminating per-joint stiffness tuning while remaining competitive across various metrics.
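
To make the idea of stiffness in the action space concrete, the following minimal sketch shows how a per-leg stiffness (PLS) action might be turned into joint torques. It uses the damping relation $K_t^d = 0.2 \sqrt{K_t^p}$ quoted in the TL;DR. Everything else is an assumption made for illustration and is not taken from the paper: the function names, the leg-by-leg joint ordering, three joints per leg, and a standard PD law with zero target velocity.

```python
import math

# Coefficient from the relation K^d = 0.2 * sqrt(K^p) quoted in the TL;DR.
DAMPING_COEFF = 0.2


def damping_from_stiffness(kp):
    """Derive the damping gain from a stiffness gain: K^d = 0.2 * sqrt(K^p)."""
    return DAMPING_COEFF * math.sqrt(kp)


def expand_per_leg_stiffness(leg_kp, joints_per_leg=3):
    """Broadcast one stiffness value per leg to every joint of that leg.

    Assumption: joints are ordered leg by leg, e.g. 4 legs x 3 joints = 12.
    """
    return [kp for kp in leg_kp for _ in range(joints_per_leg)]


def pd_torques(q_target, q, qd, kp):
    """Joint torques from a PD law with zero target velocity (assumed form).

    tau_i = kp_i * (q_target_i - q_i) - kd_i * qd_i, with kd_i derived from kp_i.
    """
    return [
        k * (qt - qi) - damping_from_stiffness(k) * qdi
        for qt, qi, qdi, k in zip(q_target, q, qd, kp)
    ]


# Hypothetical policy output: 12 target joint positions plus 4 leg stiffnesses.
leg_kp = [20.0, 20.0, 40.0, 40.0]
kp = expand_per_leg_stiffness(leg_kp)
tau = pd_torques([0.1] * 12, [0.0] * 12, [0.5] * 12, kp)
```

Under this sketch, a PLS policy outputs 12 + 4 = 16 action values instead of 12, and the low-level controller needs no hand-tuned gains because both stiffness and damping follow from the policy output.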
