Learning Agile Bipedal Motions on a Quadrupedal Robot

Yunfei Li; Jinhan Li; Wei Fu; Yi Wu

TL;DR

Problem: enable agile, human-like bipedal motions on a lightweight quadruped. Approach: formulate the task as an MDP $(\mathcal{S},\mathcal{A},\mathcal{T},r,\gamma,\rho_0)$ and optimize it with PPO. A motion-conditioned policy learns to balance on the hind toes while tracking base and front-limb targets, and a high-level motion generator converts videos or language into those motion targets. The policy is trained in a simulator calibrated against the real robot (real-to-sim calibration) and then transferred to hardware. Contributions: (i) a two-level framework enabling bipedal maneuvers on a consumer quadruped platform without external supports; (ii) real-to-sim calibration and domain randomization to bridge the sim-to-real gap; (iii) multiple human-interaction modalities, including video mimicry, natural-language instructions, and physical guidance; (iv) demonstrations on a Xiaomi CyberDog2 achieving stand-up, walking at $v_x=\pm0.3$ m/s, and interactive behavior. Significance: provides a cost-effective path to humanoid-like agility on quadruped platforms and expands human-robot interaction capabilities for assistive and collaborative tasks.
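
For reference, the MDP tuple above is usually paired with the following objective, and PPO usually maximizes the clipped surrogate shown after it. Both are given in their standard textbook form as a sketch; this summary does not state them, and the symbols $\pi_\theta$, $w_t$, $\hat{A}_t$, and $\epsilon$ are assumptions introduced here rather than notation taken from the paper.

$$J(\pi_\theta)=\mathbb{E}_{s_0\sim\rho_0,\ a_t\sim\pi_\theta(\cdot\mid s_t),\ s_{t+1}\sim\mathcal{T}(\cdot\mid s_t,a_t)}\left[\sum_{t=0}^{\infty}\gamma^{t}\,r(s_t,a_t)\right]$$

$$L^{\mathrm{CLIP}}(\theta)=\hat{\mathbb{E}}_t\left[\min\left(w_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\left(w_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],\qquad w_t(\theta)=\frac{\pi_\theta(a_t\mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t\mid s_t)}$$

Here $\hat{A}_t$ is an advantage estimate and $\epsilon$ is the clipping range. The probability ratio is written $w_t$ to avoid a clash with the initial-state distribution $\rho_0$.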

Abstract

Can a quadrupedal robot perform bipedal motions like humans? Although human-like behaviors are more often studied on costly bipedal robot platforms, we present a solution on a lightweight quadrupedal robot that unlocks the agility of the quadruped in an upright standing pose and is capable of a variety of human-like motions. Our framework has a hierarchical structure. At the low level, a motion-conditioned control policy allows the quadrupedal robot to track desired base and front-limb movements while balancing on its two hind feet. The policy is commanded by a high-level motion generator that produces trajectories of parameterized human-like motions from multiple modalities of human input. We demonstrate, for the first time, various bipedal motions on a quadrupedal robot, and showcase human-robot interaction modes including mimicking human videos, following natural language instructions, and physical interaction. The video is available at https://sites.google.com/view/bipedal-motions-quadruped.
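
The two-level structure described above can be sketched as a small interface: a high-level generator emits a sequence of motion targets, and a low-level policy is queried once per target. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `MotionTarget`, `walk_forward`, `placeholder_policy`, and `rollout` are hypothetical, and so are the six front-limb joints and the fixed observation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass(frozen=True)
class MotionTarget:
    """One step of a reference motion (field names are illustrative assumptions)."""
    base_velocity: float                    # target forward velocity v_x, in m/s
    heading: float                          # target heading, in radians
    front_joint_positions: Sequence[float]  # targets for the front-limb joints


# A low-level policy maps (observation, target) to an action vector.
Policy = Callable[[Sequence[float], MotionTarget], List[float]]


def walk_forward(steps: int, v_x: float = 0.3) -> List[MotionTarget]:
    """Hypothetical high-level generator: constant-velocity walk, front limbs held still."""
    # Six front-limb joints is an assumption made for this sketch.
    return [MotionTarget(v_x, 0.0, (0.0,) * 6) for _ in range(steps)]


def placeholder_policy(obs: Sequence[float], target: MotionTarget) -> List[float]:
    """Stand-in for the learned policy: echoes the front-limb targets as the action."""
    return list(target.front_joint_positions)


def rollout(policy: Policy, targets: Iterable[MotionTarget],
            obs: Sequence[float]) -> List[List[float]]:
    """Query the low-level policy once per high-level target.

    The observation is kept fixed because this sketch has no simulator;
    a real control loop would refresh it after every action.
    """
    return [policy(obs, target) for target in targets]


actions = rollout(placeholder_policy, walk_forward(steps=3), obs=[0.0] * 4)
```

Keeping the target as a plain data record means any input modality (video, language, physical guidance) only has to produce `MotionTarget` values, and the low-level policy stays unchanged.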

Paper Structure (12 sections, 1 equation, 10 figures, 2 tables)

Introduction
Related Work
Preliminary
Method
Learning a motion-conditioned policy with RL
Sim-to-real transfer via real-to-sim calibration
Generating reference motions for human-like maneuvers
Experiments
Performance of the motion-conditioned policy
Performing human-like motions when combined with generated motion targets
Ablation studies
Conclusion

Figures (10)

Figure 1: A quadrupedal robot demonstrates human-like motions with only hind feet on the ground. The top row shows the reference human boxing video. The bottom row shows the robot mimicking the human motion to perform multiple punches and uppercuts at a high speed.
Figure 2: The workflow of generating reference motions from human language instructions with an LLM. The language command from the user is first decomposed into a sequence of motion descriptions, then converted to targets consisting of base velocity, heading, and front limb joint positions. Example outputs from the LLM at both steps are shown in grey boxes, and the prompts we use are shown in cyan boxes.
Figure 3: Our RL policy brings the quadrupedal robot up from a lying pose to a stabilized bipedal standing pose. A separate sit-down policy then controls the robot from the upright standing pose to settle down with four legs on the ground. The learned policies demonstrate great agility, using less than 1 second to stand up and sit down, while remaining sufficiently robust to work on the real robot.
Figure 4: The quadrupedal robot demonstrates bipedal locomotion following target linear velocities $v_x=\pm0.3\,\textrm{m/s}$ to walk forward or backward, and tracking target heading directions 90 degrees to the left or 60 degrees to the right.
Figure 5: The visualization of front-toe tracking during segments of "wave hand" (left) and "ballet dance" (right) in simulation. Blue dots represent the desired positions and the orange dots represent the achieved positions. The varying hues indicate the progression of time.
...and 5 more figures
