Manipulator as a Tail: Promoting Dynamic Stability for Legged Locomotion

Huang Huang, Antonio Loquercio, Ashish Kumar, Neerja Thakkar, Ken Goldberg, Jitendra Malik

TL;DR

This work investigates turning a legged robot’s arm into a tail-like contributor to dynamic stability and agile locomotion. It introduces a three-stage incremental reinforcement learning framework with behavior cloning as a proxy loss, enabling joint control of a quadruped and a three-DOF arm while gradually unlocking degrees of freedom. The approach yields substantial gains in robustness and agility, demonstrated through both simulation and real-robot experiments: higher success rates, reduced tracking errors, and improved disturbance rejection, including rapid high-speed turning. A first-principles analysis explains how dynamic arm motions, rather than static CoM shifts, drive improved angular dynamics, validating the observed behaviors. Overall, the work shows that a lightweight manipulator can meaningfully enhance legged locomotion, motivating future low-mass tail-like designs and more automated stage synthesis for complex, multi-limb systems.

Abstract

Is an arm on a legged robot a liability or an asset for locomotion? Biological systems evolved additional limbs beyond legs that facilitate postural control. This work shows how a manipulator can be an asset for legged locomotion at high speeds or under external perturbations, where the arm serves purposes beyond manipulation. Since the system has 15 degrees of freedom (twelve for the legged robot and three for the arm), off-the-shelf reinforcement learning (RL) algorithms struggle to learn effective locomotion policies. Inspired by Bernstein's neurophysiological theory of animal motor learning, we develop an incremental training procedure that initially freezes some degrees of freedom and gradually releases them, using behaviour cloning (BC) from an earlier learning stage to guide optimization in later stages. Simulation experiments show that our policy increases the success rate by up to 61 percentage points over the baselines. Simulation and real robot experiments suggest that our policy learns to use the arm as a tail to initiate robot turning at high speeds and to stabilize the quadruped under external perturbations. Quantitatively, in simulation experiments, we cut the failure rate by up to 43.6% during high-speed turning and by up to 31.8% for the quadruped under external forces, compared to using a locked arm.
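
The incremental procedure described in the abstract (freeze some degrees of freedom, release them gradually, and use behaviour cloning from an earlier stage as an extra loss) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the choice to always release the leg joints, the squared-error form of the BC term, and the weighting `bc_weight` are all assumptions made for illustration. Only the 12 + 3 DOF split comes from the abstract.

```python
from typing import List, Sequence

NUM_LEG_DOF = 12  # from the abstract: twelve DOF for the legged robot
NUM_ARM_DOF = 3   # from the abstract: three DOF for the arm


def released_mask(num_arm_released: int) -> List[bool]:
    """True marks a DOF the policy may move.

    Assumption: leg joints are always released and arm joints are
    released one by one; the paper's actual schedule may differ.
    """
    if not 0 <= num_arm_released <= NUM_ARM_DOF:
        raise ValueError("num_arm_released must be between 0 and NUM_ARM_DOF")
    arm = [i < num_arm_released for i in range(NUM_ARM_DOF)]
    return [True] * NUM_LEG_DOF + arm


def apply_freeze(action: Sequence[float],
                 nominal: Sequence[float],
                 released: Sequence[bool]) -> List[float]:
    """Keep policy outputs on released DOFs; hold frozen DOFs at nominal."""
    return [a if r else n for a, n, r in zip(action, nominal, released)]


def bc_loss(student: Sequence[float],
            teacher: Sequence[float],
            teacher_released: Sequence[bool]) -> float:
    """Mean squared error on the DOFs the earlier-stage policy controlled."""
    errors = [(s - t) ** 2
              for s, t, r in zip(student, teacher, teacher_released) if r]
    return sum(errors) / len(errors) if errors else 0.0


def total_loss(rl_loss: float, bc: float, bc_weight: float) -> float:
    """Combine the RL objective with the BC term (hypothetical weighting)."""
    return rl_loss + bc_weight * bc


if __name__ == "__main__":
    early = released_mask(0)                 # arm fully frozen
    raw_action = [0.5] * (NUM_LEG_DOF + NUM_ARM_DOF)
    nominal_pose = [0.0] * (NUM_LEG_DOF + NUM_ARM_DOF)
    print(apply_freeze(raw_action, nominal_pose, early))
```

In this sketch the BC term only compares the joints the earlier policy actually controlled, so newly released arm joints are shaped by the RL objective alone while the leg behaviour stays close to what was already learned.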

Paper Structure (11 sections, 4 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: A learned whole-body control policy increases a legged robot's stability and agility. We evaluate the dynamic benefits of the arm in two scenarios. Top: stabilization under dynamic external forces. The actuated arm responds to an impulse force by swinging quickly away from the impulse direction to generate a balancing impulse force. As shown in Section \ref{sec:math}, the arm relies on such dynamic forces rather than static CoM changes to stabilize the robot. Bottom: dynamic agile locomotion, where the arm helps the quadruped turn left at high speeds. Before and during the turn, the arm moves toward the turning direction to increase the centripetal acceleration (toward the turning direction), similar to behaviors observed in animals. After the turn, the arm moves back to the nominal position. Videos and supplementary materials are here: https://tinyurl.com/2p8edezu
  • Figure 2: We learn whole-body control policies for quadruped locomotion with an arm in three stages, each learning a different skill of increasing complexity. Policies at each stage observe $\mathcal{O}_{quadruped}$ or $\mathcal{O}_{arm}$ and an extrinsic vector $z$ with compressed information of privileged observations $\mathcal{O}_{privileged}$, which consists of mass, friction, quadruped base velocity and external perturbations.
  • Figure 3: Simulation results and analysis of the benefit of the arm for agile locomotion.
  • Figure 4: Physical experiments under different scenarios. Our system, trained only on fractal terrains with external pushes, is successfully deployed in different real-world scenarios. The arm adopts different configurations in different scenarios.
  • Figure 5: Robot stabilization under external perturbations. The robot with a locked arm fails to maintain balance. With the actuated arm, instead of moving to its left to shift the CoM, the arm relies on dynamic torques to mitigate the impulse perturbation, as shown in Section \ref{sec:math}, which results in the arm moving to its right.
  • ...and 1 more figure