Diff-Muscle: Efficient Learning for Musculoskeletal Robotic Table Tennis

Wentao Zhao; Jun Guo; Kangyao Huang; Xin Liu; Huaping Liu

TL;DR

A hierarchical reinforcement learning framework that integrates a Kinematics-based Muscle Actuation Controller (K-MAC) with high-level trajectory planning, enabling a musculoskeletal robot to perform dexterous and precise table tennis rallies. It significantly outperforms state-of-the-art baselines in success rate while maintaining minimal muscle activation.

Abstract

Musculoskeletal robots offer clear advantages in flexibility and dexterity, positioning them as a promising frontier for embodied intelligence. However, current research is largely confined to relatively simple tasks, which restricts exploration of their full potential in multi-segment coordination. Efficient learning also remains a challenge, primarily because of the high-dimensional action space and inherently overactuated structure. To address these challenges, we propose Diff-Muscle, a musculoskeletal robot control algorithm that leverages differential flatness to reformulate policy learning from the redundant muscle-activation space into a significantly lower-dimensional joint space. We evaluate the algorithm on the highly dynamic task of robotic table tennis. Specifically, we propose a hierarchical reinforcement learning framework that integrates a Kinematics-based Muscle Actuation Controller (K-MAC) with high-level trajectory planning, enabling a musculoskeletal robot to perform dexterous and precise rallies. Experimental results demonstrate that Diff-Muscle significantly outperforms state-of-the-art baselines in success rate while maintaining minimal muscle activation. Notably, the proposed framework enables musculoskeletal robots to achieve continuous rallies in a challenging dual-robot setting.
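To make the joint-space reformulation concrete, the following is a minimal sketch of how a policy output in joint space could be converted into muscle activations by a kinematics-plus-PD controller. It is not the authors' K-MAC implementation. The two-joint, six-muscle model, the constant moment-arm matrix `MOMENT_ARMS`, the rest lengths, the gains `kp` and `kd`, and the choice to apply the PD law to muscle-length error are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy model: 2 joints driven by 6 muscles (overactuated).
# MOMENT_ARMS[i, j] is the assumed constant moment arm of muscle i about joint j.
MOMENT_ARMS = np.array([
    [ 0.03,  0.00],
    [-0.03,  0.00],
    [ 0.00,  0.02],
    [ 0.00, -0.02],
    [ 0.02,  0.02],
    [-0.02, -0.02],
])
REST_LENGTHS = np.full(6, 0.30)  # assumed muscle lengths at q = 0 (metres)


def muscle_lengths(q):
    """Forward kinematics from joint angles to muscle lengths (linear toy model)."""
    return REST_LENGTHS - MOMENT_ARMS @ np.asarray(q, dtype=float)


def joint_targets_to_activations(q, q_dot, q_target, kp=50.0, kd=1.0):
    """PD law on muscle-length error, clipped to the activation range [0, 1]."""
    # A muscle that is longer than its target length must contract.
    length_error = muscle_lengths(q) - muscle_lengths(q_target)
    # Time derivative of muscle length under the linear model (static target).
    length_rate = -MOMENT_ARMS @ np.asarray(q_dot, dtype=float)
    return np.clip(kp * length_error + kd * length_rate, 0.0, 1.0)


# The policy only has to output 2 joint targets; the controller fills in 6 activations.
activations = joint_targets_to_activations(
    q=[0.0, 0.0], q_dot=[0.0, 0.0], q_target=[0.5, 0.0]
)
```

In this toy setting the learned policy acts in a 2-dimensional space while the actuated system has 6 inputs, which mirrors, at a much smaller scale, the dimensionality reduction described above. Because muscles can only pull, the clipping step leaves antagonist muscles at zero activation.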

Paper Structure (27 sections, 21 equations, 7 figures, 1 table)


INTRODUCTION
RELATED WORKS
Musculoskeletal System Control
Robotic Table Tennis
PROBLEM FORMULATION
METHOD
Differential Flatness and Action Mapping
Conditional Differential Flatness
Kinematics-based Action Mapping
Physics-based Planner
Planner Construction
Reward Functions
Extend to Dual Robot Rally
EXPERIMENTS
Experimental Setup and Task Formulation
...and 12 more sections

Figures (7)

Figure 1: Diff-Muscle. (a) Utilizing the inherent differential flatness, Diff-Muscle reformulates the learning problem in musculoskeletal systems from high-dimensional muscle space to low-dimensional joint space. (b) We evaluate Diff-Muscle in the highly dynamic robotic table tennis task to demonstrate its capability and efficiency in learning multi-segment coordination and rapid reactive behaviors.
Figure 2: Hierarchical reinforcement learning framework. (a) The policy takes the current observation and high-level commands as input and generates target joint positions, which are then translated into muscle signals by the Kinematics-based Muscle Actuation Controller (K-MAC). (b) We extend the framework to achieve consecutive dual-robot rallies. (c) Given the ball state, the planner predicts the desired racket position, orientation, and velocity at the predefined plane. (d) The Kinematics-based Muscle Actuation Controller integrates forward kinematics and a PD controller to translate target joint positions into muscle control signals.
Figure 3: Musculoskeletal Model and Environment. (a) The musculoskeletal model consists of MyoArm (63 muscles, 27 DoF), MyoTorso (210 muscles, 3 DoF), and pelvis (2 actuators, 2 DoF). (b) Single-robot environment for returning a single served ball. (c) Dual-robot rally environment.
Figure 4: Learning curve. Diff-Muscle exhibits the most rapid improvement in reward and achieves the highest overall performance among all evaluated methods. Mean $\pm$ Std is computed across 5 random seeds.
Figure 5: Ablation results. Diff-Muscle exhibits the most rapid improvement in reward and achieves the highest overall performance among all evaluated methods. Mean $\pm$ Std is computed across 5 random seeds.
...and 2 more figures
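
The planner described in the Figure 2(c) caption maps the ball state to a desired racket state at a predefined plane. As a rough illustration of the first step such a planner might take, the sketch below predicts where and when the ball crosses a vertical plane. It assumes drag-free, spin-free ballistic flight with no table bounce, and the function name `predict_ball_at_plane` and the axis convention (plane at constant x, gravity along -z) are hypothetical. Deriving racket orientation and velocity from the predicted ball state is not covered here.

```python
import numpy as np

GRAVITY = 9.81  # m/s^2, assumed to act along -z


def predict_ball_at_plane(position, velocity, plane_x):
    """Return (time, position, velocity) when the ball crosses x = plane_x.

    Assumes ballistic flight without drag, spin, or bounce. Returns None if
    the ball is not moving towards the plane.
    """
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    dx = plane_x - position[0]
    if velocity[0] == 0.0 or dx / velocity[0] < 0.0:
        return None
    t = dx / velocity[0]  # x-velocity is constant without drag
    gravity = np.array([0.0, 0.0, -GRAVITY])
    hit_position = position + velocity * t + 0.5 * gravity * t**2
    hit_velocity = velocity + gravity * t
    return t, hit_position, hit_velocity


result = predict_ball_at_plane(
    position=[0.0, 0.0, 1.0], velocity=[2.0, 0.5, 1.0], plane_x=1.0
)
```

A closed-form prediction like this is cheap enough to re-evaluate at every control step as new ball observations arrive, which is one reason a physics-based planner can be attractive as the high-level component of a hierarchy.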
