Walk Like Dogs: Learning Steerable Imitation Controllers for Legged Robots from Unlabeled Motion Data

Dongho Kang, Jin Cheng, Fatemeh Zargarbashi, Taerim Yoon, Sungjoon Choi, Stelian Coros

TL;DR

An imitation learning framework that extracts distinctive legged locomotion behaviors and transitions between them from unlabeled real-world motion data by automatically discovering behavioral modes and mapping user steering commands to them, which enables user-steerable and stylistically consistent motion imitation.

Abstract

We present an imitation learning framework that extracts distinctive legged locomotion behaviors and transitions between them from unlabeled real-world motion data. By automatically discovering behavioral modes and mapping user steering commands to them, the framework enables user-steerable and stylistically consistent motion imitation. Our approach first bridges the morphological and physical gap between the motion source and the robot by transforming raw data into a physically consistent, robot-compatible dataset using a kino-dynamic motion retargeting strategy. This data is used to train a steerable motion synthesis module that generates stylistic, multi-modal kinematic targets from high-level user commands. These targets serve as a reference for a reinforcement learning controller, which reliably executes them on the robot hardware. In our experiments, a controller trained on dog motion data demonstrated distinctive quadrupedal gait patterns and emergent gait transitions in response to varying velocity commands. These behaviors were achieved without manual labeling, predefined mode counts, or explicit switching rules, maintaining the stylistic coherence of the data.

Paper Structure

This paper contains 18 sections, 11 equations, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Unitree Go2 robot navigating freely across a grass field in response to joystick commands (top). The gait pattern automatically transitions from Pace to Trot as the forward speed command increases from 0.6 m/s to 1.0 m/s (bottom).
  • Figure 2: Overview of the framework. An animal motion DB is first transformed into a robot motion DB using kino-dynamic motion retargeting (in blue). Next, each state transition in the motion DB is embedded into a latent space using a VAE (in purple). The trained decoder, combined with an RL-based motion synthesis policy, produces a new reference motion in response to steering commands (in green). Finally, the reference motion is tracked by an RL controller (in orange).
  • Figure 3: (a) Limb penetration and contact foot slips introduced by UVM. (b) Our kino-dynamic motion retargeting (MR) removes these artifacts and ensures both kinematic and dynamic feasibility.
  • Figure 4: (a) Our kino-dynamic MR enables reliable transfer of dog motion sequences to the real-world robot. We evaluate its effectiveness against the UVM baseline by comparing (b) kinematic artifacts in the retargeted motions and (c) the resulting RL training curves for downstream imitation tasks, where RL policies are trained to execute the motions.
  • Figure 5: Snapshots of physically simulated Go2 (top) executing a motion sequence generated by our motion synthesis module in response to varying forward speed commands, shown alongside the speed profile (middle) and leg swing timeline (bottom).
  • ...and 1 more figure
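
The Pace and Trot gaits named in the Figure 1 caption can be told apart from a leg swing timeline like the one in Figure 5: in a pace the two legs on the same side swing together, while in a trot the diagonal pairs do. The snippet below is a minimal illustrative sketch of that distinction and is not the paper's method. The leg ordering, the function names `make_timeline` and `classify_gait`, the synthetic 50% duty-cycle timelines, and the `margin` threshold are all assumptions made for this example.

```python
import numpy as np

# Assumed leg order for this sketch: front-left, front-right, rear-left, rear-right.
FL, FR, RL, RR = range(4)


def make_timeline(offsets, steps=100, period=20):
    """Build a synthetic swing timeline (hypothetical helper).

    offsets: per-leg phase offsets in time steps.
    Returns a bool array of shape (steps, 4), True where the leg is in swing.
    Each leg swings for the first half of its cycle (50% duty cycle, assumed).
    """
    t = np.arange(steps)[:, None]
    phase = (t + np.asarray(offsets)[None, :]) % period
    return phase < period // 2


def agreement(swing, a, b):
    """Fraction of time steps in which legs a and b share the same state."""
    return float(np.mean(swing[:, a] == swing[:, b]))


def classify_gait(swing, margin=0.2):
    """Label a swing timeline as 'pace', 'trot', or 'other'."""
    swing = np.asarray(swing, dtype=bool)
    # Same-side pairs move together in a pace.
    lateral = 0.5 * (agreement(swing, FL, RL) + agreement(swing, FR, RR))
    # Diagonal pairs move together in a trot.
    diagonal = 0.5 * (agreement(swing, FL, RR) + agreement(swing, FR, RL))
    if lateral - diagonal > margin:
        return "pace"
    if diagonal - lateral > margin:
        return "trot"
    return "other"


# Synthetic examples with a 20-step cycle: an offset of 10 steps is half a cycle.
pace = make_timeline([0, 10, 0, 10])   # FL+RL together, FR+RR together
trot = make_timeline([0, 10, 10, 0])   # FL+RR together, FR+RL together
print(classify_gait(pace), classify_gait(trot))
```

On real contact data the pair agreements would rarely be exactly 0 or 1, so the margin would need tuning, and other gaits would fall into the "other" bucket.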