Learning Human-like Locomotion Based on Biological Actuation and Rewards

Minkwan Kim; Yoonsang Lee

TL;DR

The paper addresses how to teach a musculoskeletal humanoid to walk in a human-like manner without reference motions or hand-crafted control rules. It combines a Hill-type musculotendon model with a deep reinforcement learning policy. Early in training, the policy receives a dense energy reward based on metabolic energy ($MET$) at every step; later, it receives a sparse energy reward based on energy per distance ($CoT$). A randomized initial posture improves exploration. The results indicate that energy-based rewards are essential for guiding exploration toward plausible locomotion and that the starting pose critically affects learning outcomes; ablations highlight the importance of observing detailed muscle states. The approach offers a path toward more biologically realistic locomotion control in simulation, with potential implications for biomechanics, robotics, and rehabilitation research. Suggested future work includes extending the method to running and jumping and incorporating arm swing dynamics.

Abstract

We propose a method for learning a human-like locomotion policy via deep reinforcement learning, based on a human anatomical model, muscle actuation, and biologically inspired rewards, without any inherent control rules or reference motions. Our main ideas are (1) to provide a dense reward based on metabolic energy consumption at every step during the initial stages of learning and then transition to a sparse reward as learning progresses, and (2) to adjust the initial posture of the human model to facilitate the exploration of locomotion. We also compare the proposed method against several alternative settings and analyze how the learning outcomes differ.
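The dense-to-sparse energy reward described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `EnergyRewardSchedule`, the hard switch at `switch_iteration`, the weights, and the exact reward forms (negative metabolic energy per step, negative energy per distance at episode end) are all assumptions made for illustration. The source states only that a dense metabolic-energy reward is used early and a sparse energy-per-distance reward later.

```python
from dataclasses import dataclass


@dataclass
class EnergyRewardSchedule:
    """Hypothetical dense-to-sparse energy reward schedule (illustrative only)."""

    switch_iteration: int   # assumed: training iteration at which dense -> sparse
    w_dense: float = 1.0    # assumed weight on the per-step metabolic term
    w_sparse: float = 1.0   # assumed weight on the end-of-episode term

    def step_reward(self, iteration: int, metabolic_rate: float, dt: float) -> float:
        """Dense phase: penalize metabolic energy spent during this step."""
        if iteration < self.switch_iteration:
            return -self.w_dense * metabolic_rate * dt
        return 0.0  # sparse phase: no per-step energy term

    def episode_end_reward(self, iteration: int, total_energy: float,
                           distance: float) -> float:
        """Sparse phase: penalize energy per distance once, at episode end."""
        if iteration < self.switch_iteration:
            return 0.0  # dense phase: energy was already penalized per step
        # Guard against division by zero when the model barely moved.
        energy_per_distance = total_energy / max(distance, 1e-6)
        return -self.w_sparse * energy_per_distance


schedule = EnergyRewardSchedule(switch_iteration=100)
early = schedule.step_reward(iteration=0, metabolic_rate=200.0, dt=0.01)
late = schedule.episode_end_reward(iteration=100, total_energy=500.0, distance=10.0)
```

A hard switch is the simplest choice for a sketch; a gradual blend between the two terms would be an equally plausible reading of "transitioning as learning progresses". Cost of transport is also often normalized by body mass, which this sketch omits.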

Paper Structure (11 sections, 2 equations, 2 figures)

Introduction
Simulation and Learning
Design for Human-like Locomotion
  Energy Rewards
  Initial Posture
Results
  Dense and Sparse Energy Rewards
  Importance of Energy Rewards
  Importance of Initial Posture
  Ablation Study
Conclusions and Future Works

Figures (2)

Figure 1: Experimental results. From top to bottom: Ours (row 1), Dense energy (MET) only (row 2), Sparse energy (CoT) only (row 3), Without energy reward (row 4), Start with a double stance pose (row 5), Dense activation reward (row 6), Without muscle fiber length in observation (row 7)
Figure 2: Hill-type muscle model
