Learning Thermal-Aware Locomotion Policies for an Electrically-Actuated Quadruped Robot

Letian Qian; Yuhang Wan; Shuhan Wang; Xin Luo

Abstract

Electrically-actuated quadrupedal robots possess high mobility on complex terrains, but their motors tend to accumulate heat under high-torque cyclic loads, potentially triggering overheat protection and limiting long-duration tasks. This work proposes a thermal-aware control method that incorporates motor temperatures into reinforcement learning locomotion policies and introduces thermal-constraint rewards to prevent temperature exceedance. Real-world experiments on the Unitree A1 demonstrate that, under a fixed 3 kg payload, the baseline policy triggers overheat protection and stops within approximately 7 minutes, whereas the proposed method can operate continuously for over 27 minutes without thermal interruptions while maintaining comparable command-tracking performance, thereby enhancing sustainable operational capability.
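As an illustration of what a thermal-constraint reward term could look like, the following is a minimal sketch and not the authors' formulation. The function name `thermal_penalty`, the 70 °C limit, the 10 °C margin, and the quadratic shape are all assumptions chosen for the example; the abstract states only that motor temperatures are observed by the policy and that a thermal-constraint reward discourages temperature exceedance.

```python
import numpy as np


def thermal_penalty(temps_c, limit_c=70.0, margin_c=10.0, weight=1.0):
    """Hypothetical soft-constraint penalty on motor temperatures.

    Returns 0 while every motor is at least `margin_c` below `limit_c`
    and becomes increasingly negative as any motor enters the margin band.
    All default values are illustrative assumptions, not values from the paper.
    """
    temps_c = np.asarray(temps_c, dtype=float)
    # Degrees by which each motor exceeds the soft threshold (limit - margin).
    excess = np.clip(temps_c - (limit_c - margin_c), 0.0, None)
    # Quadratic penalty, normalized so a motor exactly at the limit costs `weight`.
    return -weight * float(np.sum((excess / margin_c) ** 2))
```

In a training loop, a term of this kind would be added to the usual command-tracking rewards, so the policy is free to ignore temperature while the motors are cool and is pushed toward lower-torque gaits only as a motor approaches its limit.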

Paper Structure (13 sections, 8 equations, 9 figures, 3 tables)

Introduction
Motor Thermal Model for Quadruped Robots
Single-Motor Thermal Model
Whole-body Thermal Model of the Quadruped Robot
Thermal-Aware Locomotion Policy Design
Training Framework
Temperature Randomization
Reward Function
Experiments and Results
Experimental Setup
Real-world Locomotion Results
Analysis of Thermal-Aware Locomotion Performance
Conclusion

Figures (9)

Figure 1: Single-Motor Thermal Model.
Figure 2: Whole-body thermal model of the quadruped robot.
Figure 3: Overview of the proposed training framework. During training, all observations are obtained from the simulator. All networks and the thermal model run at 50 Hz, while the PD controller outputs torques to the actuators at 200 Hz within the simulation.
Figure 4: Comparison between baseline and proposed policy for continuous locomotion.
Figure 5: Real-world experimental results. In the upper figure, the dashed line indicates the motor temperature threshold. Because the experiments were conducted outdoors, the initial temperature could not be kept identical across runs. The lower figure shows the robot’s base velocity under joystick control, as estimated by the encoder network illustrated in Fig. 3.
...and 4 more figures
