Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework
Jipeng Kong, Xinzhe Liu, Yuhang Lin, Jinrui Han, Sören Schwertfeger, Chenjia Bai, Xuelong Li
TL;DR
The paper addresses robust soccer skill learning for humanoid robots by integrating perception and action through PAiD, a progressive three-stage framework. Stage I acquires motion skills via human motion tracking, Stage II adds lightweight perception-guided generalization, and Stage III applies physics-aware sim-to-real transfer to bridge the reality gap. Policies are trained with PPO on a finite-horizon MDP to maximize the expected discounted return $J(\theta)=\mathbb{E}[\sum_{t=0}^{T-1} \gamma^t r_t]$. The authors introduce a unified motion-tracking approach with adaptive sampling, a lightweight perception-reward scheme for generalization, and physics-informed domain randomization (DR) combined with CMA-ES-based system identification to align ball dynamics between simulation and the real world. On the Unitree G1, PAiD produces high-fidelity, human-like kicking, with reported success rates of $91.3\%$ on static balls and $71.9\%$ on rolling balls, along with strong real-world transfer and terrain robustness compared with baselines.
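To make the objective $J(\theta)$ concrete, the following is a minimal sketch of how the finite-horizon discounted return could be estimated from sampled episodes. It is not the authors' training code: the function names, the reward lists, and the discount value used in the example are illustrative assumptions, and PPO itself is not implemented here.

```python
def discounted_return(rewards, gamma):
    """Compute sum_{t=0}^{T-1} gamma^t * r_t for one finite episode."""
    g = 0.0
    # Accumulate backwards so each step is g_t = r_t + gamma * g_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def estimate_objective(episodes, gamma):
    """Monte Carlo estimate of J(theta): the mean discounted return
    over episodes sampled from the current policy."""
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return sum(returns) / len(returns)


# Hypothetical per-step rewards from two short rollouts (made-up numbers).
episodes = [[1.0, 1.0, 1.0], [0.0, 0.0, 4.0]]
j_hat = estimate_objective(episodes, gamma=0.5)
```

In practice a policy-gradient method such as PPO improves this quantity through a surrogate loss rather than evaluating it directly, but the estimate above is what the expectation in $J(\theta)$ refers to.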
Abstract
Soccer presents a significant challenge for humanoid robots, demanding tightly integrated perception-action capabilities for tasks like perception-guided kicking and whole-body balance control. Existing approaches suffer from inter-module instability in modular pipelines or conflicting training objectives in end-to-end frameworks. We propose Perception-Action integrated Decision-making (PAiD), a progressive architecture that decomposes soccer skill acquisition into three stages: motion-skill acquisition via human motion tracking, lightweight perception-action integration for positional generalization, and physics-aware sim-to-real transfer. This staged decomposition establishes stable foundational skills, avoids reward conflicts during perception integration, and minimizes sim-to-real gaps. Experiments on the Unitree G1 demonstrate high-fidelity human-like kicking with robust performance under diverse conditions, including static or rolling balls, various positions, and disturbances, while maintaining consistent execution across indoor and outdoor scenarios. Our divide-and-conquer strategy advances robust humanoid soccer capabilities and offers a scalable framework for complex embodied skill acquisition. The project page is available at https://soccer-humanoid.github.io/.
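As an illustration of what physics-aware system identification for ball dynamics can look like, the sketch below fits a single rolling-friction coefficient so that simulated stopping distances match observed ones. Everything in it is an assumption made for illustration: the constant-deceleration ball model, the synthetic observations, and the parameter names are hypothetical, and the optimizer is a simplified elite-averaging evolution strategy standing in for CMA-ES (it adapts neither a covariance matrix nor the step size in the way CMA-ES does).

```python
import random

G = 9.81  # gravitational acceleration in m/s^2


def stop_distance(v0, mu):
    """Distance a ball rolls before stopping under constant deceleration mu * G
    (toy model, not the paper's simulator)."""
    return v0 * v0 / (2.0 * mu * G)


def loss(mu, observations):
    """Mean squared error between modeled and observed stopping distances."""
    errors = [(stop_distance(v0, mu) - d) ** 2 for v0, d in observations]
    return sum(errors) / len(errors)


def identify_friction(observations, mu_init=0.3, sigma=0.1,
                      pop=16, elite=4, iters=60, seed=0):
    """Simplified evolution strategy: sample candidates around the current
    mean, keep the lowest-loss ones, and move the mean to their average."""
    rng = random.Random(seed)
    mean = mu_init
    for _ in range(iters):
        # Clip to a small positive value so the model stays well defined.
        candidates = [max(1e-3, rng.gauss(mean, sigma)) for _ in range(pop)]
        candidates.sort(key=lambda m: loss(m, observations))
        mean = sum(candidates[:elite]) / elite
        sigma *= 0.9  # fixed decay schedule instead of CMA-ES step-size control
    return mean


# Synthetic "real-world" data generated from a known coefficient.
TRUE_MU = 0.15
observations = [(v0, stop_distance(v0, TRUE_MU)) for v0 in (1.0, 2.0, 3.0, 4.0)]
estimate = identify_friction(observations)
```

A real pipeline would replace `stop_distance` with rollouts of the physics simulator and the synthetic observations with measured ball trajectories; a full CMA-ES implementation would also handle several correlated parameters at once.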
