Learning Human-Like Badminton Skills for Humanoid Robots

Yeke Chen; Shihao Dong; Xiaoyu Ji; Jingkai Sun; Zeren Luo; Liu Zhao; Jiahui Zhang; Wanyue Li; Ji Ma; Bowen Xu; Yimin Han; Yudong Zhao; Peng Lu

TL;DR

The paper tackles the problem of enabling humanoid robots to play human-like badminton by bridging kinematic imitation and dynamic interaction. It proposes a four-stage Imitation-to-Interaction framework that learns a robust motor prior from motion capture, distills it into a goal-conditioned policy with a compact, model-based state representation, stabilizes that policy with Adversarial Motion Priors, and finally refines it in a physics-enabled environment to master interception and recovery. Key innovations include a compact state representation built from Time-to-Hit and Target Hit/Recovery states, a manifold expansion strategy that converts sparse demonstrations into a dense interaction space, and zero-shot sim-to-real transfer demonstrated on a humanoid robot. The approach yields diverse skills, including forehand lifts, backhand lifts, and drop shots, and remains robust in the real world despite hardware limitations. It also informs future work on stability-agility trade-offs and full-court play. Overall, the work advances end-to-end policy learning for highly dynamic sports in humanoids by integrating motion priors, goal-conditioned RL, adversarial style constraints, and physics-aware interaction.

Abstract

Realizing versatile and human-like performance in high-demand sports like badminton remains a formidable challenge for humanoid robotics. Unlike standard locomotion or static manipulation, this task demands a seamless integration of explosive whole-body coordination and precise, timing-critical interception. While recent advances have achieved lifelike motion mimicry, bridging the gap between kinematic imitation and functional, physics-aware striking without compromising stylistic naturalness is non-trivial. To address this, we propose Imitation-to-Interaction, a progressive reinforcement learning framework designed to evolve a robot from a "mimic" to a capable "striker." Our approach establishes a robust motor prior from human data, distills it into a compact, model-based state representation, and stabilizes dynamics via adversarial priors. Crucially, to overcome the sparsity of expert demonstrations, we introduce a manifold expansion strategy that generalizes discrete strike points into a dense interaction volume. We validate our framework through the mastery of diverse skills, including lifts and drop shots, in simulation. Furthermore, we demonstrate the first zero-shot sim-to-real transfer of anthropomorphic badminton skills to a humanoid robot, successfully replicating the kinetic elegance and functional precision of human athletes in the physical world.
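The abstract describes manifold expansion only at a high level: discrete strike points are generalized into a dense interaction volume. The following is a minimal sketch of one way such sampling could look, assuming uniform jitter around each demonstrated strike point and its time-to-hit. The function name, the jitter radii, and the 0.05 s floor are hypothetical illustrations, not the authors' procedure.

```python
import random


def expand_strike_targets(strike_points, hit_times, n_samples,
                          pos_jitter=0.15, time_jitter=0.1, seed=0):
    """Sample dense (target, time-to-hit) goals around sparse demonstrations.

    strike_points: list of (x, y, z) strike locations from demonstrations.
    hit_times: time-to-hit in seconds, one value per strike point.
    pos_jitter, time_jitter: hypothetical perturbation radii (metres, seconds).
    """
    if not strike_points or len(strike_points) != len(hit_times):
        raise ValueError("need exactly one time-to-hit per strike point")
    rng = random.Random(seed)
    goals = []
    for _ in range(n_samples):
        # Pick one demonstrated strike as the centre of the perturbation.
        i = rng.randrange(len(strike_points))
        x, y, z = strike_points[i]
        target = (
            x + rng.uniform(-pos_jitter, pos_jitter),
            y + rng.uniform(-pos_jitter, pos_jitter),
            z + rng.uniform(-pos_jitter, pos_jitter),
        )
        # Keep the time-to-hit strictly positive (assumed floor of 0.05 s).
        t = max(0.05, hit_times[i] + rng.uniform(-time_jitter, time_jitter))
        goals.append((target, t))
    return goals


demo_points = [(0.5, 0.2, 1.0), (0.6, -0.3, 1.4)]
demo_times = [0.8, 1.0]
goals = expand_strike_targets(demo_points, demo_times, n_samples=100)
```

Each sampled goal stays within the jitter radius of some demonstrated strike, so the union of samples fills a volume around the sparse demonstrations rather than repeating them exactly.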

Paper Structure (34 sections, 13 equations, 4 figures, 2 tables)

Introduction
Related Works
Humanoid Whole-Body Control
Dynamic Ball Sports in Legged Robotics
Goal-Conditioned Reinforcement Learning
Method
Overview
Motion Data Processing & Retargeting
Stage 1: Kinematic Motor Prior Learning (Teacher)
Stage 2: Goal-Conditioned Distillation (Student)
Goal-Conditioned State Representation
Reward Design
Stage 3: Motion Stabilization with RL
Stage 4: Interaction-Driven Refinement
Manifold Expansion
...and 19 more sections

Figures (4)

Figure 1: Real-world Deployment of the System. We present a learning-based framework that enables a humanoid to perform agile shuttlecock interceptions using a racket. The snapshots demonstrate the zero-shot Sim-to-Real transfer of two fundamental skills: (a, b) Forehand Lifts and (c, d) Backhand Lifts. The blue circles highlight the successful contact moments between the racket and the shuttlecock. Despite the complexity of the motions, our policy maintains robust balance and tracking accuracy on physical hardware.
Figure 2: Overview of the Framework. The pipeline progressively transforms a kinematic imitator into a dynamic striker through four stages: (Stage 1) Imitation: A teacher policy learns to robustly track human motions from MoCap data using proprioceptive (blue) and imitation goal (green) observations. (Stage 2) Distillation: The teacher's capabilities are distilled into a student policy via DAgger. The student operates on a reduced observation space consisting of proprioception, task goals (yellow: target hit/recovery states), and time-to-hit (red), removing dependency on future motion trajectories. (Stage 3) Stabilization: The student policy is fine-tuned using RL with an AMP discriminator to enforce stylistic plausibility (Style Reward) while minimizing tracking errors, stabilizing the motion against drift. (Stage 4) Interaction: In the final physics-interactive environment, the policy undergoes refinement with simulated shuttlecock dynamics, generalizing to a dense spatio-temporal manifold to achieve precise, agile striking.
Figure 3: Manifold Expansion of Strike Targets. The scattered points represent the discrete strike locations from the original MoCap dataset, color-coded by the time-to-hit (blue for shorter, red for longer durations) relative to the robot's initial pose. The semi-transparent cyan volume illustrates the expanded, continuous striking manifold achieved by our interaction-driven refinement stage. Our method empowers the robot to generalize from these sparse human demonstrations to a dense, volumetric region of reachable targets across varying temporal horizons.
Figure 4: Diverse Badminton Skills Learned via the Proposed Framework. Time-lapse sequences demonstrating the humanoid's mastery of distinct striking techniques. (a) Backhand Lift: The robot rotates its torso to generate "whipping" power for a cross-body return. (b) Forehand Lift: The robot executes an extended lunge to reach a distant target and quickly recovers balance. (c) Drop Shot: The robot performs an overhead strike with a natural inertial follow-through. Pink dots visualize the historical trajectory of the shuttlecock.
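The Figure 2 caption states that the student policy observes proprioception, task goals (target hit and recovery states), and time-to-hit, with no future motion trajectory. A minimal sketch of assembling such an observation follows. The vector sizes, the field ordering, and the clamping of time-to-hit at zero are assumptions made for illustration, since the caption does not specify them.

```python
def build_student_observation(proprio, target_hit, target_recovery, time_to_hit):
    """Concatenate the student's inputs into one flat observation vector.

    proprio: proprioceptive readings (size is hypothetical here).
    target_hit: goal state at the strike instant (layout is hypothetical).
    target_recovery: goal state after the strike (layout is hypothetical).
    time_to_hit: seconds remaining until the planned strike.
    """
    # Assumption: clamp at zero so the value stays non-negative once the
    # strike instant has passed.
    tth = max(0.0, float(time_to_hit))
    return (
        [float(v) for v in proprio]
        + [float(v) for v in target_hit]
        + [float(v) for v in target_recovery]
        + [tth]
    )


obs = build_student_observation(
    proprio=[0.0] * 6,
    target_hit=[0.6, 0.2, 1.3],
    target_recovery=[0.0, 0.0, 0.9],
    time_to_hit=0.75,
)
```

Because the observation contains only goal states and a scalar countdown, the same policy can be queried with any target inside the reachable volume, which is what allows later stages to move beyond the demonstrated trajectories.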
