Reinforcement learning-based motion imitation for physiologically plausible musculoskeletal motor control

Merkourios Simos, Alberto Silvio Chiappa, Alexander Mathis

TL;DR

Reinforcement learning is applied to a musculoskeletal model of the lower body (MyoLeg) with $80$ muscles and $20$ DoF to learn a model-free motion imitation policy (KINESIS) trained on $1.9$ hours of KIT-Locomotion data. The approach achieves accurate motion imitation, zero-shot imitation of text-conditioned motion generated by a diffusion model, and fine-tuning for high-level tasks such as target reaching. The muscle activations it generates correlate with human EMG signals, and this physiological plausibility is used to investigate Bernstein's redundancy problem in locomotion. Key methodological components include latent time-correlated exploration (Lattice), hard negative mining to build specialized experts, and a Mixture-of-Experts gate to unify skills, all within a PPO framework. The results demonstrate strong imitation across locomotion skills, effective text-to-motion transfer, and physiological plausibility, with potential applications in animation, robotics, prosthetics, and rehabilitation. Limitations include the absence of upper-body dynamics and idealized proprioception.

Abstract

How do humans move? The quest to understand human motion has broad applications in numerous fields, ranging from computer animation and motion synthesis to neuroscience, human prosthetics and rehabilitation. Although advances in reinforcement learning (RL) have produced impressive results in capturing human motion using simplified humanoids, controlling physiologically accurate models of the body remains an open challenge. In this work, we present a model-free motion imitation framework (KINESIS) to advance the understanding of muscle-based motor control. Using a musculoskeletal model of the lower body with 80 muscle actuators and 20 DoF, we demonstrate that KINESIS achieves strong imitation performance on 1.9 hours of motion capture data, is controllable by natural language through pre-trained text-to-motion generative models, and can be fine-tuned to carry out high-level tasks such as target goal reaching. Importantly, KINESIS generates muscle activity patterns that correlate well with human EMG activity. The physiological plausibility makes KINESIS a promising model for tackling challenging problems in human motor control theory, which we highlight by investigating Bernstein's redundancy problem in the context of locomotion. Code, videos and benchmarks will be available at https://github.com/amathislab/Kinesis.

Paper Structure

This paper contains 30 sections, 8 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: KINESIS is a model-free RL policy for the musculoskeletal control of locomotion. Top left: Our policy is trained on a curated set of MoCap data focusing on locomotion, and can successfully imitate unseen motion clips of the same locomotion type. Top right: KINESIS can be deployed zero-shot on synthetic text-conditioned motion sequences. Bottom left: KINESIS is fine-tuned to a high-level goal target reaching task, and produces human-like motion according to real-time instructions. Bottom right: Muscle activity patterns produced during motion imitation (active muscles shown in orange, inactive muscles shown in black) correlate well with human electrophysiology data (EMG).
  • Figure 2: An illustration of the hard negative mining strategy. The first policy network (expert) is trained on the entire dataset, containing five locomotion skills of varying difficulty (here denoted as easy, medium, and hard). After a given number of training epochs, the motions that the first expert successfully imitates are removed from the dataset, and the reduced dataset is used to train a new copy of the expert. The process repeats until the dataset is empty. Finally, a MoE gating network is trained to select the appropriate expert for a specific time step, given the current proprioceptive state and target pose.
  • Figure 3: The musculoskeletal model is imitating a "Turn in Place" motion; a rendering of the reference motion is shown on the left.
  • Figure 4: The musculoskeletal model is imitating synthetic motions generated with MDM (Tevet et al., 2023). Generated trajectories are shown at the top left. Top: "The person turned right". Bottom: "The person walked forward".
  • Figure 5: Performance heatmap of learned motions by different policy modules, clustered by locomotion skill and sorted by motion description.
  • ...and 7 more figures
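
The hard negative mining loop described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The names `hard_negative_mining`, `train_expert`, `imitates`, and `max_experts` are hypothetical, as are the toy motions and the toy "skill level" expert used to exercise the loop. The `max_experts` cap and the early stop when an expert solves nothing are added assumptions; the caption only says the process repeats until the dataset is empty. The MoE gating network is not covered.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Expert = TypeVar("Expert")
Motion = TypeVar("Motion")


def hard_negative_mining(
    motions: Sequence[Motion],
    train_expert: Callable[[List[Motion], Optional[Expert]], Expert],
    imitates: Callable[[Expert, Motion], bool],
    max_experts: int = 10,  # assumption: safety cap, not from the paper
) -> Tuple[List[Tuple[Expert, List[Motion]]], List[Motion]]:
    """Train experts on progressively harder subsets of the dataset.

    Returns (experts, leftover): each expert is paired with the motions it
    solved; leftover holds motions that no expert imitated successfully.
    """
    remaining = list(motions)
    experts: List[Tuple[Expert, List[Motion]]] = []
    previous: Optional[Expert] = None

    while remaining and len(experts) < max_experts:
        # Train a new copy of the expert on the motions still unsolved.
        expert = train_expert(remaining, previous)

        # Evaluate once per motion, then split into solved / unsolved.
        flags = [imitates(expert, m) for m in remaining]
        solved = [m for m, ok in zip(remaining, flags) if ok]
        experts.append((expert, solved))

        if not solved:
            break  # assumption: stop if no progress is made

        remaining = [m for m, ok in zip(remaining, flags) if not ok]
        previous = expert

    return experts, remaining


# Toy usage: a motion is (name, difficulty); an "expert" is just a skill level.
toy_motions = [("walk", 1), ("turn", 2), ("backwards", 3)]


def toy_train(data: List[Tuple[str, int]], prev: Optional[int]) -> int:
    # Each new expert is one level more capable than the previous one.
    return (prev or 0) + 1


def toy_imitates(expert: int, motion: Tuple[str, int]) -> bool:
    return motion[1] <= expert


toy_experts, toy_leftover = hard_negative_mining(
    toy_motions, toy_train, toy_imitates
)
```

In an actual setup, `train_expert` would presumably wrap a PPO training run and `imitates` a rollout-based success criterion. A gating network would then pick among the returned experts at each time step, given the proprioceptive state and target pose.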