Catching Spinning Table Tennis Balls in Simulation with End-to-End Curriculum Reinforcement Learning
Xiaoyi Hu, Yue Mao, Gang Wang, Qingdu Li, Jianwei Zhang, Yunfeng Ji
TL;DR
The paper tackles the difficulty of catching and returning highly spun table tennis balls by introducing an end-to-end curriculum reinforcement learning framework that progressively trains a robot to catch the ball, hit it, and aim it at a target. The framework couples a physics-informed spinning-ball collision model with trajectory-state-based rewards and a parallel generator of valid rally trajectories, enabling efficient training in simulation. A Real2Sim transfer pipeline then validates the learned policy on real-world-like trajectories, using adaptations of the perception and execution systems to bridge the sim-to-real gap. The results show that curriculum RL combined with the spinning-ball collision model yields superior performance and generalizes to real spinning-ball scenarios and arbitrary target landing points, with potential applicability to other cyclical robotic tasks.
Abstract
Table tennis is known for extremely high spin rates, yet most table tennis robots today struggle to handle balls with such rapid spin. To address this issue, we contribute a series of methods: 1. Curriculum Reinforcement Learning (RL): the robot learns to play table tennis progressively, from easy tasks to difficult ones. 2. Analysis of Spinning Table Tennis Ball Collisions: a physics-based analysis that generates more realistic post-collision trajectories for spinning balls. 3. Definition of Trajectory States: trajectory states that aid in setting up the reward function. 4. Selection of Valid Rally Trajectories: a selection scheme that ensures the robot's training is not influenced by abnormal trajectories. 5. Reality-to-Simulation (Real2Sim) Transfer: a scheme for validating the trained robot's ability to handle spinning balls in real-world scenarios. With Real2Sim, the deployment costs of robotic reinforcement learning can be further reduced. Moreover, the trajectory-state-based reward function is not limited to table tennis robots; it can be generalized to a wide range of cyclical tasks. We conducted Real2Sim experiments to validate the robot's ability to handle spinning balls; a link to the experiment video is provided in the supplementary materials.
