Multi-Camera Asynchronous Ball Localization and Trajectory Prediction with Factor Graphs and Human Poses
Qingyu Xiao, Zulfiqar Zaidi, Matthew Gombolay
TL;DR
This work addresses fast localization and trajectory prediction of tennis balls whose flight is affected by the Magnus effect, using an asynchronous, multi-camera factor-graph framework. The framework fuses camera observations with physics-based factors (projection, motion, aerodynamics, and bounce) and adds spin priors computed from human poses by a Temporal Convolutional Network (TCN) to improve inference early in the flight. The approach, implemented with GTSAM and iSAM2 for incremental inference, reduces the RMSE of landing-position predictions by over 63.6% compared to a baseline adaptive EKF method, and reports a spin-prior RMSE of 5.27 Hz on validation. These results indicate that real-time, pose-informed ball tracking is practical for robotic tennis and can support more reliable planning in ball-returning systems, while leaving room for improvement in spin estimation and bounce modeling.
Abstract
Rapid and precise localization and prediction of a ball are critical for developing agile robots in ball sports, particularly tennis, which is characterized by high ball speeds and heavy spin. Spin complicates trajectory prediction both during flight, through the Magnus effect, and at the bounce, through the dynamics of contact with the ground. In this study, we introduce an approach that combines a multi-camera system with factor graphs for real-time, asynchronous 3D tennis ball localization. We also estimate hidden states such as velocity and spin for trajectory prediction. To improve spin inference early in the ball's flight, when few observations are available, we use a temporal convolutional network (TCN) to compute spin priors from human pose data and insert them at the beginning of the factor graph. These priors improve early-stage hidden-state inference and therefore prediction. Our results show that the trained TCN predicts the spin priors with an RMSE of 5.27 Hz. Integrating the TCN into the factor graph reduces the prediction error of landing positions by over 63.6% compared to a baseline method that uses an adaptive extended Kalman filter.
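To illustrate why spin matters for landing prediction, the following is a minimal sketch of a ball flight model with gravity, quadratic drag, and a Magnus term, integrated forward until the ball reaches the ground. It is not the authors' motion or aerodynamic factor. The function names, the lumped coefficients `K_DRAG` and `K_MAGNUS`, the ball constants, and the simplified Magnus form (acceleration proportional to the cross product of angular velocity and velocity) are all assumptions chosen for illustration.

```python
import math

GRAVITY = 9.81  # m/s^2

# Assumed ball and air constants (illustrative values, not from the paper).
AIR_DENSITY = 1.2    # kg/m^3
BALL_MASS = 0.057    # kg
BALL_RADIUS = 0.033  # m
DRAG_COEFF = 0.55    # dimensionless, assumed constant

# Lumped drag coefficient: 0.5 * rho * Cd * A / m, in 1/m.
K_DRAG = 0.5 * AIR_DENSITY * DRAG_COEFF * math.pi * BALL_RADIUS**2 / BALL_MASS
# Lumped Magnus coefficient (assumed constant; real lift depends on spin ratio).
K_MAGNUS = 4e-4


def cross(a, b):
    """Cross product of two 3-vectors given as tuples or lists."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def acceleration(vel, spin_hz):
    """Gravity + quadratic drag + simplified Magnus acceleration (z is up)."""
    speed = math.sqrt(sum(c * c for c in vel))
    omega = tuple(2.0 * math.pi * s for s in spin_hz)  # Hz -> rad/s
    magnus = cross(omega, vel)
    return (
        -K_DRAG * speed * vel[0] + K_MAGNUS * magnus[0],
        -K_DRAG * speed * vel[1] + K_MAGNUS * magnus[1],
        -GRAVITY - K_DRAG * speed * vel[2] + K_MAGNUS * magnus[2],
    )


def predict_landing(pos, vel, spin_hz, dt=0.001, max_time=10.0):
    """Integrate with semi-implicit Euler until z reaches 0.

    Returns (x, y, t) at the ground crossing, linearly interpolated
    within the final step.
    """
    if pos[2] <= 0.0:
        raise ValueError("initial height must be positive")
    pos, vel = list(pos), list(vel)
    t = 0.0
    while t < max_time:
        acc = acceleration(vel, spin_hz)
        new_vel = [v + a * dt for v, a in zip(vel, acc)]
        new_pos = [p + v * dt for p, v in zip(pos, new_vel)]
        if new_pos[2] <= 0.0:
            # Fraction of the step at which the ball crosses z = 0.
            frac = pos[2] / (pos[2] - new_pos[2])
            x = pos[0] + frac * (new_pos[0] - pos[0])
            y = pos[1] + frac * (new_pos[1] - pos[1])
            return x, y, t + frac * dt
        pos, vel = new_pos, new_vel
        t += dt
    raise ValueError("ball did not land within max_time")


if __name__ == "__main__":
    start, velocity = (0.0, 0.0, 1.0), (25.0, 0.0, 3.0)
    # For a ball travelling along +x, positive spin about +y is topspin.
    for label, spin in [("topspin", (0.0, 30.0, 0.0)),
                        ("no spin", (0.0, 0.0, 0.0)),
                        ("backspin", (0.0, -30.0, 0.0))]:
        x, y, t = predict_landing(start, velocity, spin)
        print(f"{label:9s} lands at x={x:.2f} m, y={y:.2f} m after {t:.3f} s")
```

Under these assumptions, the same launch position and velocity produce different landing points depending on spin: topspin shortens the flight and backspin lengthens it. This sensitivity is why a poor spin estimate early in the flight degrades landing prediction, and why an informative spin prior can help before many camera observations have arrived.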
