LiPS: Large-Scale Humanoid Robot Reinforcement Learning with Parallel-Series Structures

Qiang Zhang, Gang Han, Jingkai Sun, Wen Zhao, Jiahang Cao, Jiaxu Wang, Hao Cheng, Lingfeng Zhang, Yijie Guo, Renjing Xu

TL;DR

LiPS targets the sim2real gap in reinforcement learning for humanoid robots with complex parallel-series ankle structures by integrating multi-rigid-body dynamics into GPU-accelerated simulation to enable large-scale training. It develops a floating-base dynamics model for multi-rigid-body humanoids with parallel ankles and formulates a large-scale RL environment that trains in the parallel domain and maps outputs to series actuation via the transposed Jacobian $J^T$ during deployment. Experimental results show that LiPS-trained policies transfer effectively to the real Tien Kung humanoid, with lower computational load and improved robustness compared to traditional serial-training plus post-hoc conversion. The approach provides a generalizable, URDF-compatible framework for rapid, scalable RL of complex humanoids, potentially accelerating development of robust locomotion in real-world settings.
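The transposed-Jacobian mapping mentioned above can be illustrated with a minimal sketch. It assumes, as a convention not stated in the summary, that the actuator velocities of the parallel ankle relate to the ankle joint velocities (pitch, roll) by $\dot{\theta}_m = J\,\dot{q}$, so that virtual work gives $\tau_q = J^T \tau_m$. The function name and the numeric Jacobian are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np


def motor_to_joint_torque(J, tau_motor):
    """Map actuator torques to equivalent ankle joint torques.

    Assumes motor velocities satisfy theta_m_dot = J @ q_dot, so the
    principle of virtual work gives tau_q = J^T @ tau_m.
    """
    J = np.asarray(J, dtype=float)
    tau_motor = np.asarray(tau_motor, dtype=float)
    return J.T @ tau_motor


# Hypothetical Jacobian of a 2-DoF parallel ankle at one configuration.
J = np.array([[1.0, 0.5],
              [1.0, -0.5]])

tau_m = np.array([2.0, 1.0])   # actuator torques (illustrative values)
q_dot = np.array([0.2, -0.4])  # ankle pitch/roll velocities (illustrative)

tau_q = motor_to_joint_torque(J, tau_m)

# Sanity check: mechanical power is the same in both coordinate sets.
power_motor = tau_m @ (J @ q_dot)
power_joint = tau_q @ q_dot
```

In a real mechanism the Jacobian depends on the ankle configuration and would be re-evaluated at every control step; the constant matrix here only keeps the sketch short.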

Abstract

In recent years, research on humanoid robots has garnered significant attention, particularly in reinforcement-learning-based control algorithms, which have achieved major breakthroughs. Compared to traditional model-based control algorithms, reinforcement-learning-based algorithms demonstrate substantial advantages in handling complex tasks. Leveraging the large-scale parallel computing capabilities of GPUs, contemporary humanoid robots can undergo extensive parallel training in simulated environments. A physical simulation platform capable of large-scale parallel training is crucial for the development of humanoid robots. As one of the most complex robot forms, humanoid robots typically possess intricate mechanical structures, encompassing numerous series and parallel mechanisms. However, many reinforcement-learning-based humanoid robot control algorithms currently employ open-loop topologies during training, deferring the conversion to series-parallel structures until the sim2real phase. This practice stems primarily from the limitations of physics engines: current GPU-based physics engines often support only open-loop topologies or have limited capabilities in simulating multi-rigid-body closed-loop topologies. To enable reinforcement-learning-based humanoid robot control algorithms to train in large-scale parallel environments, we propose LiPS, a novel training method. By incorporating multi-rigid-body dynamics modeling in the simulation environment, we significantly reduce the sim2real gap and the difficulty of converting to parallel structures during model deployment, thereby robustly supporting large-scale reinforcement learning for humanoid robots.

Paper Structure

This paper contains 12 sections, 17 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: Four well-known humanoid robots all use parallel ankle mechanisms, but each has a different design approach: (a) Wukong uses a nearly decoupled parallel structure, (b) Tesla Optimus employs linear actuators for the parallel ankle design, while (c) Fourier GR-1 and (d) Tien Kung use rotary actuators but have different structural approaches at the ankle connection.
  • Figure 2: Following the descriptions of current humanoid robot configurations in noreils2024humanoid and muhammad2010closed, we conducted a similarly detailed analysis of the ankle structure.
  • Figure 3: Schematic Diagram of Ankle Dynamics Modeling
  • Figure 4: Illustration of LiPS Simulation Training and Real-World Deployment Process.
  • Figure 5: Joint velocities and positions of the two joints in the parallel structure of the ankle during the actual operation of the robot.
  • ...and 1 more figure