Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot
Neil Guan, Shangqun Yu, Shifan Zhu, Donghyun Kim
TL;DR
This work tackles the sim-to-real gap in RL-driven quadruped locomotion with impedance matching, a frequency-domain framework that aligns simulated actuator dynamics with those of the real robot. Using chirp-based frequency-response analysis and Bode-plot alignment, the authors derive simulation gains and dynamics-randomization ranges that enable safe transfer while preserving dynamic capabilities such as walking, running, and jumping. A multi-task RL policy trained with Net2Net-inspired behavior expansion is evaluated on a Mini Cheetah Vision platform, where it demonstrates stable omnidirectional running and forward, backward, and lateral jumps, clearing gaps of up to 55 cm and heights of up to 38 cm (with a center-of-mass jump distance approaching 96 cm). The approach reduces the sim-to-real gap, provides principled guidelines for domain randomization, and yields a real-world policy that operates close to the limits of the hardware.
Abstract
Replicating the remarkable athleticism seen in animals has long been a challenge in robot control. Although Reinforcement Learning (RL) has made significant progress in dynamic legged locomotion control, the substantial sim-to-real gap often hinders real-world demonstration of truly dynamic movements. We propose a new framework that mitigates this gap through impedance matching between simulated and real robots, based on frequency-domain analysis. Our framework offers a structured guideline for selecting simulation parameters and the ranges for dynamics randomization, thus facilitating safe sim-to-real transfer. A policy learned with our framework enabled jumps across distances of 55 cm and heights of 38 cm. To the best of our knowledge, these are among the highest and longest running jumps demonstrated by an RL-based control policy on a real quadruped robot. The achieved jumping height is approximately 85% of that obtained with a state-of-the-art trajectory optimization method, which can be regarded as the physical limit of the given robot hardware. In addition, our control policy achieved stable walking at speeds of up to 2 m/s in the forward and backward directions and 1 m/s in the sideways direction.
