Manipulator as a Tail: Promoting Dynamic Stability for Legged Locomotion
Huang Huang, Antonio Loquercio, Ashish Kumar, Neerja Thakkar, Ken Goldberg, Jitendra Malik
TL;DR
This work investigates turning a legged robot’s arm into a tail-like contributor to dynamic stability and agile locomotion. It introduces a three-stage incremental reinforcement learning framework with behavior cloning as a proxy loss, enabling joint control of a quadruped and a three-DOF arm while gradually unlocking degrees of freedom. The approach yields substantial gains in robustness and agility, demonstrated through both simulation and real-robot experiments: higher success rates, reduced tracking errors, and improved disturbance rejection, including rapid high-speed turning. A first-principles analysis explains how dynamic arm motions, rather than static CoM shifts, drive improved angular dynamics, validating the observed behaviors. Overall, the work shows that a lightweight manipulator can meaningfully enhance legged locomotion, motivating future low-mass tail-like designs and more automated stage synthesis for complex, multi-limb systems.
Abstract
Is an arm on a legged robot a liability or an asset for locomotion? Biological systems evolved additional limbs beyond legs that facilitate postural control. This work shows how a manipulator can be an asset for legged locomotion at high speeds or under external perturbations, where the arm serves a purpose beyond manipulation. Since the system has 15 degrees of freedom (twelve for the legged robot and three for the arm), off-the-shelf reinforcement learning (RL) algorithms struggle to learn effective locomotion policies. Inspired by Bernstein's neurophysiological theory of animal motor learning, we develop an incremental training procedure that initially freezes some degrees of freedom and gradually releases them, using behaviour cloning (BC) from an earlier learning stage to guide optimization in later stages. Simulation experiments show that our policy increases the success rate by up to 61 percentage points over the baselines. Simulation and real robot experiments suggest that our policy learns to use the arm as a tail to initiate robot turning at high speeds and to stabilize the quadruped under external perturbations. Quantitatively, in simulation experiments, we cut the failure rate by up to 43.6% during high-speed turning and by up to 31.8% for the quadruped under external forces, compared to using a locked arm.
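The incremental procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The 12 + 3 joint split comes from the abstract. Everything else is an assumption made for illustration: the stage-to-joint schedule `STAGE_FREE_ARM_JOINTS`, the helper names `dof_mask`, `apply_mask`, and `bc_regularized_loss`, the squared-error form of the BC term, and the `bc_weight` parameter.

```python
import numpy as np

NUM_LEG_DOF = 12  # from the abstract: twelve for the legged robot
NUM_ARM_DOF = 3   # from the abstract: three for the arm
NUM_DOF = NUM_LEG_DOF + NUM_ARM_DOF

# Hypothetical schedule: which arm joints are released at each stage.
# The source only says DOFs are frozen first and gradually released;
# the specific order here is an assumption.
STAGE_FREE_ARM_JOINTS = {1: (), 2: (0,), 3: (0, 1, 2)}


def dof_mask(stage):
    """Return a length-15 vector with 1.0 for joints the policy may move."""
    mask = np.zeros(NUM_DOF)
    mask[:NUM_LEG_DOF] = 1.0  # leg joints are always controlled
    for joint in STAGE_FREE_ARM_JOINTS[stage]:
        mask[NUM_LEG_DOF + joint] = 1.0
    return mask


def apply_mask(action, mask, frozen_pose):
    """Hold frozen joints at a fixed pose; pass free joints through."""
    return mask * action + (1.0 - mask) * frozen_pose


def bc_regularized_loss(rl_loss, student_action, teacher_action,
                        prev_mask, bc_weight=1.0):
    """Add a behaviour-cloning penalty to an RL loss (assumed form).

    The penalty is the mean squared difference between the current
    policy's action and the previous-stage policy's action, measured
    only on joints that were already free in the previous stage.
    """
    diff = (student_action - teacher_action) * prev_mask
    bc_loss = np.sum(diff ** 2) / np.sum(prev_mask)
    return rl_loss + bc_weight * bc_loss
```

In this sketch, stage 1 trains with the arm locked, and each later stage enlarges the mask while the BC term keeps the already-learned joints close to the previous stage's behaviour. For example, `apply_mask(action, dof_mask(2), frozen_pose)` would leave two of the three arm joints at `frozen_pose` under the assumed schedule.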
