Adaptive Control Strategy for Quadruped Robots in Actuator Degradation Scenarios

Xinyuan Wu, Wentao Dong, Hang Lai, Yong Yu, Ying Wen

TL;DR

This work addresses actuator degradation faults in quadruped locomotion with ADAPT, a teacher–student framework that combines reinforcement learning with a transformer-based policy to sustain locomotion using only onboard sensors. Twelve teacher policies, each conditioned on a degraded joint, are trained in simulation and distilled into a single transformer student that transfers zero-shot to real robots. The approach models a continuous per-joint degradation rate and trains the student by offline behavior cloning on teacher-generated trajectories. On the Unitree A1, the student shows fault tolerance close to teacher performance and transfers from simulation to the real robot without online fault-specific tuning.
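The continuous per-joint degradation rate can be illustrated with a small sketch. It assumes that degradation scales the commanded torque of one joint linearly by `(1 - rate)`, so that a rate of 1 means complete damage; the summary does not state the exact fault model, and `apply_degradation` and `NUM_JOINTS` are hypothetical names introduced only for this example.

```python
from typing import List, Sequence

# Assumption: 12 actuated joints, matching the twelve teacher policies.
NUM_JOINTS = 12


def apply_degradation(torques: Sequence[float], joint: int, rate: float) -> List[float]:
    """Return the torques with one joint weakened by a degradation rate.

    Assumed fault model (not confirmed by the source): the faulty joint
    delivers (1 - rate) of its commanded torque, so rate = 0 is healthy
    and rate = 1 is complete damage.
    """
    if len(torques) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} torques, got {len(torques)}")
    if not 0 <= joint < NUM_JOINTS:
        raise ValueError(f"joint index {joint} out of range")
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"degradation rate {rate} must lie in [0, 1]")

    degraded = list(torques)  # copy so the caller's command is untouched
    degraded[joint] = (1.0 - rate) * degraded[joint]
    return degraded


if __name__ == "__main__":
    command = [10.0] * NUM_JOINTS
    # Joint 3 loses a quarter of its torque; all other joints are unchanged.
    print(apply_degradation(command, joint=3, rate=0.25))
```

Under this assumed model, a simulator would apply the scaling between the policy output and the physics step, so the policy never observes the rate directly and must infer the fault from onboard sensor history.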

Abstract

Quadruped robots have strong adaptability to extreme environments but may also experience faults. Once these faults occur, robots must be repaired before returning to the task, reducing their practical feasibility. One prevalent concern among these faults is actuator degradation, stemming from factors like device aging or unexpected operational events. Traditionally, addressing this problem has relied heavily on intricate fault-tolerant design, which demands deep domain expertise from developers and lacks generalizability. Learning-based approaches offer effective ways to mitigate these limitations, but a research gap exists in effectively deploying such methods on real-world quadruped robots. This paper introduces a pioneering teacher-student framework rooted in reinforcement learning, named Actuator Degradation Adaptation Transformer (ADAPT), aimed at addressing this research gap. This framework produces a unified control strategy, enabling the robot to sustain its locomotion and perform tasks despite sudden joint actuator faults, relying exclusively on its internal sensors. Empirical evaluations on the Unitree A1 platform validate the deployability and effectiveness of ADAPT on real-world quadruped robots, and affirm the robustness and practicality of our approach.

Paper Structure

This paper contains 29 sections, 11 equations, 13 figures, and 2 tables.

Figures (13)

  • Figure 1: Overall framework of ADAPT. ADAPT starts by training 12 teacher policies separately in simulation. These teacher policies then generate trajectories, which are used to distill a unified transformer-based student policy. The student policy is then ready for zero-shot deployment to real-world robots.
  • Figure 2: Policy performance over different fault settings (joint, actuator degradation rate). The horizontal axis shows the actuator degradation rate, with 1 indicating complete damage, and the vertical axis shows the affected joint. Each cell is the cumulative reward averaged over 1024 runs in parallel simulation for that scenario. Each run randomly samples the initial state, commanded speed, and fault occurrence time within specified ranges. Reward recording stops when the robot falls or after 1000 timesteps have been collected.
  • Figure 3: Illustration of the adaptive gait. Top: The robot's gait changes in the simulator, with ground-contact feet marked by blue dots. Middle: The L2 norm of the contact force on each foot. Bottom: Foot contact time with the ground; darker color represents larger ground contact force.
  • Figure 4: The projection of gravity components on the robot's plane during locomotion, with the robot's forward direction as the positive y-axis and its right-hand side as the positive x-axis. Different colors reflect changes in the gravitational force components over time.
  • Figure 5: Real-world deployment results of three control models for actuator degradation, with damage restricted to the LFC joint. Images have been mirrored for visual consistency.
  • ...and 8 more figures
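
The evaluation protocol described in the Figure 2 caption (cumulative reward per run, stopped at a fall or after 1000 timesteps, then averaged over parallel runs) can be sketched as follows. This is a minimal illustration rather than the authors' evaluation code: `episode_return` and `cell_score` are hypothetical names, and the choice to exclude the reward of the timestep in which the fall is detected is an assumption.

```python
from typing import List, Sequence, Tuple

MAX_STEPS = 1000  # recording limit stated in the Figure 2 caption

# One run: per-step rewards and per-step "robot has fallen" flags.
Run = Tuple[Sequence[float], Sequence[bool]]


def episode_return(rewards: Sequence[float], fell: Sequence[bool]) -> float:
    """Sum rewards until the robot falls or MAX_STEPS rewards are collected."""
    total = 0.0
    for step, (reward, has_fallen) in enumerate(zip(rewards, fell)):
        # Assumption: the reward of the step where the fall is flagged is dropped.
        if step >= MAX_STEPS or has_fallen:
            break
        total += reward
    return total


def cell_score(runs: List[Run]) -> float:
    """Average cumulative reward over all runs of one (joint, rate) scenario."""
    if not runs:
        raise ValueError("at least one run is required")
    returns = [episode_return(rewards, fell) for rewards, fell in runs]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    runs = [
        ([1.0, 1.0, 1.0], [False, False, True]),  # falls on the third step
        ([2.0, 2.0], [False, False]),             # short run without a fall
    ]
    print(cell_score(runs))
```

In the setting described by the caption, `runs` would hold the 1024 parallel simulations for one joint and one degradation rate, and the resulting score would fill one cell of the grid.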