Towards Embodied AI with MuscleMimic: Unlocking full-body musculoskeletal motor learning at scale

Chengkun Li, Cheryl Wang, Bianca Ziliotto, Merkourios Simos, Jozsef Kovecses, Guillaume Durandau, Alexander Mathis

Abstract

Learning motor control for muscle-driven musculoskeletal models is hindered by the computational cost of biomechanically accurate simulation and the scarcity of validated, open full-body models. Here we present MuscleMimic, an open-source framework for scalable motion imitation learning with physiologically realistic, muscle-actuated humanoids. MuscleMimic provides two validated musculoskeletal embodiments - a fixed-root upper-body model (126 muscles) for bimanual manipulation and a full-body model (416 muscles) for locomotion - together with a retargeting pipeline that maps SMPL-format motion capture data onto musculoskeletal structures while preserving kinematic and dynamic consistency. Leveraging massively parallel GPU simulation, the framework achieves order-of-magnitude training speedups over prior CPU-based approaches while maintaining comprehensive collision handling, enabling a single generalist policy to be trained on hundreds of diverse motions within days. The resulting policy faithfully reproduces a broad repertoire of human movements under full muscular control and can be fine-tuned to novel motions within hours. Biomechanical validation against experimental walking and running data demonstrates strong agreement in joint kinematics (mean correlation r = 0.90), while muscle activation analysis reveals both the promise and fundamental challenges of achieving physiological fidelity through kinematic imitation alone. By lowering the computational and data barriers to musculoskeletal simulation, MuscleMimic enables systematic model validation across diverse dynamic movements and broader participation in neuromuscular control research. Code, models, checkpoints, and retargeted datasets are available at: https://github.com/amathislab/musclemimic
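
The reported kinematic agreement (mean correlation r = 0.90) amounts to averaging per-joint Pearson correlations between simulated and experimental joint-angle trajectories. A minimal sketch of that computation, assuming time-normalized (T, J) angle arrays; the function name and shapes are illustrative, not the actual MuscleMimic evaluation code:

```python
import numpy as np

def mean_joint_correlation(sim_angles: np.ndarray, exp_angles: np.ndarray) -> float:
    """Mean Pearson r across joints.

    Both inputs are assumed to be (T, J) arrays: T time-normalized samples
    (e.g. percent gait cycle) for J joint angles.
    """
    rs = [np.corrcoef(sim_angles[:, j], exp_angles[:, j])[0, 1]
          for j in range(sim_angles.shape[1])]
    return float(np.mean(rs))

# Synthetic check: small noise on a reference signal gives r close to 1.
t = np.linspace(0, 2 * np.pi, 101)
ref = np.stack([np.sin(t), np.cos(t)], axis=1)
sim = ref + 0.05 * np.random.randn(*ref.shape)
print(mean_joint_correlation(sim, ref))
```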

Figures (19)

  • Figure 1: Visualization of the MyoBimanualArm and MyoFullBody models, viewed from (A) front, (B) back, and (C) side, shown in the first and second rows, respectively.
  • Figure 2: Total system throughput (raw training steps per second) as the number of parallel environments ($n$) scales from $16$ to $8192$, with all other hyperparameters held fixed. Evaluated on an Intel Xeon Platinum 8570 CPU and a single NVIDIA H100 80GB GPU, using a fixed 32 mini-batches and 50 steps per rollout. With $8192$ environments, throughput increases by around $7800\%$ (a throughput-measurement sketch follows the figure list).
  • Figure 3: Effect of gradient epochs ($E$) on training stability. We compare $E=1$ (truly on-policy), $E=3$, and $E=10$ (aggressive sample reuse). (A) Early training (first 30M steps): higher $E$ accelerates initial learning due to more gradient updates per sample. (B) Full training trajectory: $E=1$ achieves superior asymptotic performance while $E=3$ and $E=10$ plateau or collapse. (C) KL divergence between the current and data-generating policy distributions (log scale); with identical clipping, $E>1$ exhibits catastrophic distribution shift with spikes exceeding $10^{10}$, whereas $E=1$ remains stable below $10^{-1}$ (a KL-monitoring sketch follows the figure list).
  • Figure 4: Effect of minibatch size on training dynamics. We compare minibatch sizes of 32, 64, and 128. (A) Performance: larger batch sizes achieve higher asymptotic rewards. (B) Exploration stability: smaller batches cause the policy standard deviation to overshoot, while larger batches maintain stable convergence near the initialization. (C) Policy update magnitude (log scale): larger batches yield lower KL divergence throughout training, indicating more conservative and stable policy updates.
  • Figure 5: Motion snapshots from pre-trained MyoBimanualArm policies (fingers disabled). From top to bottom: lifting objects, throwing a ball, waving, and pouring then placing water.
  • ...and 14 more figures
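
Figure 2's raw-steps-per-second metric can be reproduced in spirit by timing a fixed number of synchronous batched simulation steps and dividing total environment steps by wall-clock time. The sketch below substitutes a dummy NumPy environment for the GPU-batched simulator; DummyBatchedEnv and its reset/step interface are illustrative assumptions, not the MuscleMimic API.

```python
import time
import numpy as np

class DummyBatchedEnv:
    """Stand-in for a GPU-batched musculoskeletal environment (illustrative)."""
    def __init__(self, n_envs: int, action_dim: int = 416):
        self.n_envs, self.action_dim = n_envs, action_dim
        self.state = np.zeros((n_envs, 64), dtype=np.float32)

    def reset(self):
        self.state[:] = 0.0

    def step(self, actions: np.ndarray):
        # A real simulator would integrate muscle and rigid-body dynamics here.
        self.state += 0.001 * actions[:, : self.state.shape[1]]
        return self.state

def measure_throughput(env: DummyBatchedEnv, n_steps: int = 500) -> float:
    """Raw steps per second: n_envs * n_steps / wall-clock time."""
    env.reset()
    actions = np.zeros((env.n_envs, env.action_dim), dtype=np.float32)
    start = time.perf_counter()
    for _ in range(n_steps):
        env.step(actions)  # one synchronous batched step
    return env.n_envs * n_steps / (time.perf_counter() - start)

for n in (16, 256, 2048, 8192):
    print(n, f"{measure_throughput(DummyBatchedEnv(n)):.0f} steps/s")
```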
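
Figures 3 and 4 both use the KL divergence between the data-generating (old) policy and the current policy as a stability diagnostic. For the diagonal Gaussian policies typical of PPO-style training, this KL has a closed form; the sketch below, with illustrative batch and action dimensions, shows how drift in the policy mean after repeated gradient epochs inflates the divergence.

```python
import numpy as np

def diag_gaussian_kl(mu_old, std_old, mu_new, std_new) -> float:
    """KL(old || new) for diagonal Gaussians, summed over action
    dimensions and averaged over a batch of states."""
    var_old, var_new = std_old ** 2, std_new ** 2
    kl = (np.log(std_new / std_old)
          + (var_old + (mu_old - mu_new) ** 2) / (2.0 * var_new)
          - 0.5)
    return float(kl.sum(axis=-1).mean())

# Larger post-update drift in the policy mean -> larger KL: the
# instability signature seen for E > 1 in Figure 3C.
rng = np.random.default_rng(0)
mu_old = rng.normal(size=(1024, 126))  # batch x action dims (126 muscles)
std = np.full_like(mu_old, 0.1)
for drift in (0.01, 0.1, 1.0):
    mu_new = mu_old + drift * rng.normal(size=mu_old.shape)
    print(f"drift={drift}: KL={diag_gaussian_kl(mu_old, std, mu_new, std):.3f}")
```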