Adaptive Tracking of a Single-Rigid-Body Character in Various Environments

Taesoo Kwon, Taehong Gu, Jaewon Ahn, Yoonsang Lee

TL;DR

The paper addresses robust locomotion adaptation for simulated characters in environments not seen during training. It represents the full-body character as a single rigid body (SRB) using a centroidal dynamics model and trains a reinforcement learning policy to track a reference motion; the reduced state and action dimensions make learning sample-efficient. At runtime, the simulated SRB motion is converted into plausible full-body motion via a precomputed delta and momentum-mapped inverse kinematics, and learned policies can be switched and blended without additional learning. The policy adapts to uneven terrain and external pushes, and it trains in far less time than full-body DRL methods such as DeepMimic (the abstract reports training within 30 minutes on an ultraportable laptop).

Abstract

Since the introduction of DeepMimic [Peng et al. 2018], subsequent research has focused on expanding the repertoire of simulated motions across various scenarios. In this study, we propose an alternative approach for this goal, a deep reinforcement learning method based on the simulation of a single-rigid-body character. Using the centroidal dynamics model (CDM) to express the full-body character as a single rigid body (SRB) and training a policy to track a reference motion, we can obtain a policy that is capable of adapting to various unobserved environmental changes and controller transitions without requiring any additional learning. Due to the reduced dimension of state and action space, the learning process is sample-efficient. The final full-body motion is kinematically generated in a physically plausible way, based on the state of the simulated SRB character. The SRB simulation is formulated as a quadratic programming (QP) problem, and the policy outputs an action that allows the SRB character to follow the reference motion. We demonstrate that our policy, efficiently trained within 30 minutes on an ultraportable laptop, has the ability to cope with environments that have not been experienced during learning, such as running on uneven terrain or pushing a box, and transitions between learned policies, without any additional learning.
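
To make the single-rigid-body idea concrete, the following is a minimal sketch of the standard Newton-Euler bookkeeping for one rigid body driven by contact forces. It is not the authors' QP formulation: the QP that selects contact forces and the rotational integration are omitted, and all names (`net_wrench`, `step_com`, the y-up gravity convention) are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

# Assumption: y-up world frame, SI units.
GRAVITY: Vec3 = (0.0, -9.81, 0.0)


def add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def net_wrench(com: Vec3, contacts: List[Tuple[Vec3, Vec3]]) -> Tuple[Vec3, Vec3]:
    """Sum the contact forces and their torques about the center of mass.

    contacts is a list of (contact position, contact force) pairs.
    """
    force: Vec3 = (0.0, 0.0, 0.0)
    torque: Vec3 = (0.0, 0.0, 0.0)
    for position, f in contacts:
        force = add(force, f)
        torque = add(torque, cross(sub(position, com), f))
    return force, torque


def step_com(mass: float, com: Vec3, vel: Vec3,
             contacts: List[Tuple[Vec3, Vec3]], dt: float) -> Tuple[Vec3, Vec3, Vec3]:
    """One semi-implicit Euler step of the linear (COM) dynamics.

    Returns the new COM position, the new COM velocity, and the net torque
    about the COM (which would drive the body's rotation; not integrated here).
    """
    force, torque = net_wrench(com, contacts)
    acc = add(scale(force, 1.0 / mass), GRAVITY)  # a = f / m + g
    vel = add(vel, scale(acc, dt))
    com = add(com, scale(vel, dt))
    return com, vel, torque
```

With two symmetric foot contacts that together carry the body weight, the net force cancels gravity and the net torque about the COM is zero, so the body stays at rest; a controller or solver would adjust these forces to track a reference motion.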

Paper Structure (19 sections, 12 equations, 10 figures)


Figures (10)

  • Figure 1: System overview.
  • Figure 2: SRB character.
  • Figure 3: Pushing a box of varying weights.
  • Figure 4: Plots for external pushes. Dotted line: the smallest external force at which the character always loses balance. Solid line: the largest external force at which the character maintains balance at each phase without falling. Circles: the ratio of successful balance maintenance (no falling within 20 seconds after the push) out of 10 trials at points where both balance maintenance and loss occur, with its size representing the proportion of successful balance maintenance.
  • Figure 5: Conceptual depiction of aligning COM trajectory and calculating COM frame delta. The frames on the plane are the center of each foot and blue and red balls are its heel and toe contact points.
  • ...and 5 more figures