Enhancing Multi-Camera Gymnast Tracking Through Domain Knowledge Integration
Fan Yang, Shigeyuki Odashima, Shoichi Masui, Ikuo Kusajima, Sosuke Yamao, Shan Jiang
TL;DR
This work tackles robust 3D gymnast tracking under limited cross-view observations for gymnastics judging. It introduces a domain-knowledge–driven cascaded data association that switches between triangulation and ray-plane intersection, leveraging the tendency of gymnasts to move within a predefined vertical plane. The approach, built on four calibrated RGB cameras and a multi-stage processing pipeline, shows substantial reductions in ID switches and pose-estimation errors compared with state-of-the-art baselines, especially when only two opposing views are available, and has been deployed at the Gymnastics World Championships. The method promises practical benefits for objective judging and broader sport-video analysis by integrating sport-specific constraints into multi-camera tracking.
Abstract
We present a robust multi-camera gymnast tracking method, which has been applied at international gymnastics championships for gymnastics judging. Despite considerable progress in multi-camera tracking algorithms, tracking gymnasts presents unique challenges: (i) due to space restrictions, only a limited number of cameras can be installed in the gymnastics stadium; and (ii) due to variations in lighting, background, uniforms, and occlusions, multi-camera gymnast detection may fail in certain views and provide valid detections from only two opposing views. These factors make it difficult to determine a gymnast's 3D trajectory accurately with conventional multi-camera triangulation. To alleviate this issue, we incorporate gymnastics domain knowledge into our tracking solution. Given that a gymnast's 3D center typically lies within a predefined vertical plane during much of their performance, we can apply a ray-plane intersection to generate coplanar 3D trajectory candidates for opposing-view detections. More specifically, we propose a novel cascaded data association (DA) paradigm that employs triangulation to generate 3D trajectory candidates when cross-view detections are sufficient, and resorts to the ray-plane intersection when they are insufficient. Consequently, coplanar candidates are used to compensate for uncertain trajectories, thereby minimizing tracking failures. The robustness of our method is validated through extensive experiments, which demonstrate its superiority over existing methods in challenging scenarios. Furthermore, our gymnastics judging system, equipped with this tracking method, has been successfully applied at recent Gymnastics World Championships, earning significant recognition from the International Gymnastics Federation.
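The switch between triangulation and ray-plane intersection described in the abstract can be illustrated with a minimal geometric sketch. It is not the authors' implementation: the function names, the two-view restriction, the midpoint triangulation, and the angle threshold `min_angle_deg` used to decide whether the views are "sufficient" are all assumptions made for illustration. Each ray is a camera center plus the viewing direction through the detected gymnast center, and the vertical plane is given by a point and a normal.

```python
import math
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Ray = Tuple[Vec3, Vec3]  # (camera center, viewing direction)


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def along(origin: Vec3, direction: Vec3, t: float) -> Vec3:
    """Point at parameter t on the ray origin + t * direction."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def ray_plane_intersection(ray: Ray, plane_point: Vec3, plane_normal: Vec3,
                           eps: float = 1e-9) -> Optional[Vec3]:
    """Intersect a viewing ray with the plane; None if parallel or behind."""
    origin, direction = ray
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # ray runs parallel to the plane
    offset = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(offset, plane_normal) / denom
    return along(origin, direction, t) if t >= 0 else None


def midpoint_triangulation(ray_a: Ray, ray_b: Ray) -> Vec3:
    """Midpoint of the closest points on two non-parallel rays."""
    (o1, d1), (o2, d2) = ray_a, ray_b
    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero only for parallel rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1, p2 = along(o1, d1, s), along(o2, d2, t)
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


def candidates_from_two_views(ray_a: Ray, ray_b: Ray, plane_point: Vec3,
                              plane_normal: Vec3,
                              min_angle_deg: float = 15.0
                              ) -> Tuple[str, List[Vec3]]:
    """Hypothetical cascade: triangulate if the rays cross at a usable
    angle, otherwise fall back to coplanar ray-plane candidates."""
    d1, d2 = ray_a[1], ray_b[1]
    cos_angle = dot(d1, d2) / math.sqrt(dot(d1, d1) * dot(d2, d2))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    # Opposing views give nearly antiparallel rays (angle close to 180).
    if min(angle, 180.0 - angle) >= min_angle_deg:
        return "triangulation", [midpoint_triangulation(ray_a, ray_b)]
    hits = [ray_plane_intersection(r, plane_point, plane_normal)
            for r in (ray_a, ray_b)]
    return "ray_plane", [h for h in hits if h is not None]
```

With two cameras facing each other across the plane, the rays are nearly antiparallel, so the sketch skips triangulation and returns one coplanar candidate per view. How the actual system defines "sufficient" cross-view detections is not stated in the abstract, so the angle test here is only a stand-in.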
