Actuator-Constrained Reinforcement Learning for High-Speed Quadrupedal Locomotion

Young-Ha Shin; Tae-Gyu Song; Gwanghyeon Ji; Hae-Won Park

Abstract

This paper presents a method for achieving high-speed running of a quadruped robot by considering the actuator torque-speed operating region in reinforcement learning. The physical properties and constraints of the actuator are included in the training process to reduce state transitions that are infeasible in the real world due to motor torque-speed limitations. The gait reward is designed to distribute motor torque evenly across all legs, contributing to more balanced power usage and mitigating performance bottlenecks due to single-motor saturation. Additionally, we designed a lightweight foot to enhance the robot's agility. We observed that applying the motor operating region as a constraint helps the policy network avoid infeasible areas during sampling. With the trained policy, KAIST Hound, a 45 kg quadruped robot, can run up to 6.5 m/s, which is the fastest speed among electric motor-based quadruped robots.
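The idea of saturating a commanded torque by a torque-speed operating region can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a simple steady-state motor model (voltage limit with the time derivative of the current ignored, plus a symmetric peak-torque limit). The function name `saturate_torque` and all parameter names and values (`v_max`, `resistance`, `k_t`, `tau_peak`) are hypothetical and are not taken from the paper.

```python
def saturate_torque(tau_cmd, omega, v_max=48.0, resistance=0.1, k_t=0.1, tau_peak=30.0):
    """Clip a commanded motor torque to a simplified torque-speed region.

    Assumed model (hypothetical, for illustration only):
      voltage balance  v = resistance * i + k_t * omega, with |v| <= v_max
      torque           tau = k_t * i
      current limit    |tau| <= tau_peak
    The current dynamics (di/dt) are ignored.
    """
    # Torque bounds implied by the voltage limit at the current speed.
    upper_voltage = k_t * (v_max - k_t * omega) / resistance
    lower_voltage = k_t * (-v_max - k_t * omega) / resistance

    # Intersect with the symmetric peak-torque (current) limit.
    upper = min(tau_peak, upper_voltage)
    lower = max(-tau_peak, lower_voltage)

    # Clamp the command into the feasible interval.
    return min(max(tau_cmd, lower), upper)


if __name__ == "__main__":
    # At standstill only the peak-torque limit is active; at high speed the
    # voltage limit shrinks the torque available in the direction of motion.
    for omega in (0.0, 200.0, 400.0):
        print(omega, saturate_torque(50.0, omega), saturate_torque(-50.0, omega))
```

In a training loop, a clamp of this kind would be applied to the policy's torque output before it reaches the simulator, so that sampled actions cannot produce torques the motor could not deliver at its current speed.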

Paper Structure (20 sections, 4 equations, 12 figures, 2 tables)

Introduction
Methods
Motor operating region (MOR)
Transformation between joint space and motor space
Motor torque saturation
Gait reward
Learning framework descriptions
Overall structure
Reward functions
Strategies for high-speed running in simulation
Lightweight foot design
Results
Evaluation in simulation
System setup
Experimental results
...and 5 more sections

Figures (12)

Figure 1: KAIST Hound crosses the 1.5-meter distance between two lines on the treadmill in 0.229 seconds, at an average speed of 6.5 m/s. The snapshots are taken from a video recorded at 240 frames per second.
Figure 2: Motor current versus motor torque. The solid curve shows the nonlinear relationship caused by the ferromagnetic core, and the dashed line shows its linear approximation.
Figure 3: (a) Operating region of a BLDC motor (shaded grey), derived under the assumption that the time derivative of the current is ignored. (b) Torque-velocity constraints defined by the URDF in joint space.
Figure 4: (a) The conceptual design of KAIST Hound actuator. (b) Superposition of the angular velocity of the KFE and HFE.
Figure 5: Overview of the proposed reinforcement learning framework. The action is expressed as motor torque, and the torque is saturated by the MOR in simulation. The nonlinearity between current and torque caused by the ferromagnetic core is compensated when the network performs inference.
...and 7 more figures
