
MaskAdapt: Learning Flexible Motion Adaptation via Mask-Invariant Prior for Physics-Based Characters

Soomin Park, Eunseong Lee, Kwang Bin Lee, Sung-Hee Lee

Abstract

We present MaskAdapt, a framework for flexible motion adaptation in physics-based humanoid control. The framework follows a two-stage residual learning paradigm. In the first stage, we train a mask-invariant base policy using stochastic body-part masking and a regularization term that enforces consistent action distributions across masking conditions. This yields a robust motion prior that remains stable under missing observations, anticipating later adaptation in those regions. In the second stage, a residual policy is trained atop the frozen base controller to modify only the targeted body parts while preserving the original behaviors elsewhere. We demonstrate the versatility of this design through two applications: (i) motion composition, where varying masks enable multi-part adaptation within a single sequence, and (ii) text-driven partial goal tracking, where designated body parts follow kinematic targets provided by a pre-trained text-conditioned autoregressive motion generator. Through experiments, MaskAdapt demonstrates strong robustness and adaptability, producing diverse behaviors under masked observations and delivering superior targeted motion adaptation compared to prior work.
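The mask-invariant regularization described above can be illustrated with a small sketch: body-part features are stochastically zeroed out, and a consistency penalty compares the action distribution produced from the masked observation against the one produced from the full observation. Everything here is hypothetical (a toy linear policy with fixed action noise, made-up dimensions, and a closed-form Gaussian KL); it is not the paper's architecture, only an illustration of the idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 5 body parts, 6 observation features each.
NUM_PARTS, FEAT = 5, 6
OBS_DIM, ACT_DIM = NUM_PARTS * FEAT, 8

# Toy linear "policy": observation -> Gaussian action mean with fixed std.
W = rng.normal(scale=0.1, size=(OBS_DIM, ACT_DIM))
SIGMA = 0.3

def policy_mean(obs):
    return obs @ W

def apply_mask(obs, part_mask):
    # part_mask[i] == 0 hides body part i by zeroing its features.
    return obs * np.repeat(part_mask, FEAT)

def mi_loss(obs, part_mask):
    # KL divergence between masked and unmasked Gaussian action
    # distributions with a shared fixed std reduces to
    # ||mu_masked - mu_full||^2 / (2 * sigma^2).
    mu_full = policy_mean(obs)
    mu_masked = policy_mean(apply_mask(obs, part_mask))
    return float(np.sum((mu_masked - mu_full) ** 2) / (2 * SIGMA ** 2))

obs = rng.normal(size=OBS_DIM)
mask = np.array([1.0, 1.0, 0.0, 0.0, 1.0])  # hide body parts 2 and 3
print(mi_loss(obs, mask) >= 0.0)            # KL is non-negative
print(mi_loss(obs, np.ones(NUM_PARTS)))     # identity mask -> zero loss
```

Minimizing such a penalty over randomly sampled masks pushes the policy toward producing consistent actions whether or not a body part is observed, which is the robustness property the abstract attributes to the base controller.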

Paper Structure

This paper contains 31 sections, 14 equations, 10 figures, 11 tables, and 2 algorithms.

Figures (10)

  • Figure 1: Frame visitation frequencies for locomotion with both legs masked. The policy without MI loss (blue) shows a skewed visitation pattern, whereas applying the MI loss (yellow) produces a markedly more uniform distribution. Our mask-invariant base achieves a level of uniformity comparable to AMP (green), indicating that the MI loss preserves diverse state visitation even under heavy masking.
  • Figure 2: Overview of the base policy training process. At each step, a mask $m_t$ specifies which body parts are hidden from observation. The full state $s_t$ is masked to form $\bar{s}_t$, which is passed to the base policy $\pi$ to generate the action $a_t$. The mask-invariant loss enforces consistency between masked and unmasked action distributions, promoting robustness to partial observations.
  • Figure 2: Qualitative comparison of composition methods, each motion generated from the same initial state. In (a), the kick motion in CML is delayed; in (b), CML exhibits severe artifacts and fails to complete full arm rotations; and in (c), CML produces limited motion combinations compared to ours.
  • Figure 3: Overview of training the residual policy for motion composition. At each step, a mask $m_t$ specifies which body parts are subject to adaptation, and $\psi$ learns to generate residual actions for those regions while maintaining stability elsewhere.
  • Figure 3: Qualitative comparison of our method, CML, and a from-scratch model for the partial tracking task given the text commands on the left. The red region highlights deviations from the base behavior.
  • ...and 5 more figures
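The residual stage sketched in the figure captions above (a mask selecting which body parts are adapted, with the frozen base controller handling the rest) can be written compactly as a per-part additive composition. The shapes and the purely additive form below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Hypothetical actuation layout: 4 body parts, 3 actuators each.
NUM_PARTS, ACT_PER_PART = 4, 3

def compose_action(base_action, residual_action, part_mask):
    # Residual corrections apply only to the adapted (masked) parts;
    # unmasked parts keep the frozen base controller's action unchanged.
    m = np.repeat(part_mask, ACT_PER_PART)
    return base_action + m * residual_action

base = np.zeros(NUM_PARTS * ACT_PER_PART)   # stand-in for the base policy output
resid = np.ones(NUM_PARTS * ACT_PER_PART)   # stand-in for the residual policy output
mask = np.array([0.0, 1.0, 0.0, 1.0])       # adapt body parts 1 and 3 only

out = compose_action(base, resid, mask)
print(out.reshape(NUM_PARTS, ACT_PER_PART))
```

Because the mask gates the residual term to zero on unadapted parts, the composed action there is exactly the base action, which mirrors the stated goal of modifying only the targeted body parts while preserving the original behavior elsewhere.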