BroadTrack: Broadcast Camera Tracking for Soccer

Floriane Magera, Thomas Hoyoux, Olivier Barnich, Marc Van Droogenbroeck

TL;DR

BroadTrack addresses the need for a robust, off-the-shelf camera-tracking solution for elevated soccer broadcast cameras by integrating a tailored broadcast-camera model that incorporates radial distortion and fixed tripod geometry with a multi-component tracking pipeline. The system combines field-markings detectors, optical flow, and a nonlinear Levenberg–Marquardt optimization driven by three losses, plus an online confidence measure to trigger reinitialization. It achieves state-of-the-art performance on SoccerNet datasets, notably improving JaC metrics and reprojection accuracy, and demonstrates strong robustness in long sequences with limited reinitializations. The work also provides practical insights into the benefits of physical camera modeling for tracking stability and discusses future enhancements, including dynamic tripod constraints and vertigo-effect modeling.

Abstract

Camera calibration and localization, sometimes simply called camera calibration, enables many applications in soccer broadcasting, such as the interpretation and analysis of the game, or the insertion of augmented reality graphics for storytelling or refereeing purposes. To contribute to such applications, the research community has typically focused on single-view calibration methods, leveraging the near-omnipresence of soccer field markings in wide-angle broadcast views, but leaving all temporal aspects, if considered at all, to general-purpose tracking or filtering techniques. Only a few contributions leverage domain-specific knowledge for this tracking task, and, as a result, there is still no truly performant, off-the-shelf camera tracking system tailored for soccer broadcasting, specifically for elevated tripod-mounted cameras around the stadium. In this work, we present such a system, capable of addressing the task of soccer broadcast camera tracking efficiently, robustly, and accurately, and outperforming the most precise state-of-the-art methods by a wide margin. By combining the available open-source soccer field detectors with carefully designed camera and tripod models, our tracking system, BroadTrack, halves the mean reprojection error rate and gains more than 15% in terms of Jaccard index for camera calibration on the SoccerNet dataset. Furthermore, as the SoccerNet dataset videos are relatively short (30 seconds), we also present qualitative results on a 20-minute broadcast clip to showcase the robustness and the soundness of our system.

Paper Structure

This paper contains 23 sections, 8 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Typical pan-tilt head for a broadcast camera with a 25$\times$ lens (taken from Sachtler2023Video25; other pan-tilt heads can be found online, e.g. Vinten2023Vision250).
  • Figure 2: Tripod model. The center of rotation $\boldsymbol{T}$ remains fixed as the camera rotates, and the point $\boldsymbol{O}$, which lies on the optical axis of the camera, remains at a fixed distance $\delta$ from $\boldsymbol{T}$.
  • Figure 3: Camera parameter visualizations, best viewed on screen. The first row displays pan and focal length values along test sequences of the sn-gamestate dataset. \ref{subfig:tracking:nbjw-f} and \ref{subfig:tracking:nbjw-pan} show the jitter of the parameters extracted by NBJW compared to BroadTrack. \ref{subfig:tracking:no-of-f} shows that the optical flow helps to smooth focal length values. The second row displays the position of the focal point $\boldsymbol{C}$ in the $XY$ plane. \ref{subfig:tracking:nbjw-pos-1} and \ref{subfig:tracking:nbjw-pos-2} show that the focal point estimated by NBJW can travel up to $20$ meters within a single sequence, while our focal point remains in a close neighborhood of the estimated tripod position. Finally, \ref{subfig:tracking:no-T-pos-2} shows the benefit of including the tripod constraint on the camera position, which remains closer to the estimated center of rotation $\boldsymbol{T}$.
  • Figure 4: Qualitative evaluation on the 20-minute sequence from the Bundesliga. One out of every $10{,}000$ images is displayed. Soccer field markings are reprojected in red using the estimated camera parameters $\kappa$.
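
The tripod constraint described in the Figure 2 caption can be illustrated with a small numerical sketch. This is not the authors' implementation. It assumes, for illustration only, that $\boldsymbol{O}$ is the point of the optical axis closest to $\boldsymbol{T}$, and it measures how far $\lVert \boldsymbol{O} - \boldsymbol{T} \rVert$ deviates from $\delta$. The function names, the tripod position, and the value of `delta` are all hypothetical.

```python
import numpy as np

def closest_point_on_axis(C, d, T):
    """Orthogonal projection of T onto the line through C with direction d."""
    d = d / np.linalg.norm(d)
    return C + np.dot(T - C, d) * d

def tripod_residual(C, d, T, delta):
    """Deviation of |O - T| from delta, with O taken as the point of the
    optical axis closest to T (an assumption made for this sketch)."""
    O = closest_point_on_axis(C, d, T)
    return np.linalg.norm(O - T) - delta

def demo_residuals():
    """Residuals for cameras that pan around a fixed T while obeying the model."""
    T = np.array([0.0, -60.0, 15.0])  # hypothetical tripod position (metres)
    delta = 0.3                       # hypothetical axis offset (metres)
    residuals = []
    for pan in np.linspace(-0.6, 0.6, 5):
        d = np.array([np.sin(pan), np.cos(pan), 0.0])   # optical axis direction
        u = np.array([np.cos(pan), -np.sin(pan), 0.0])  # unit vector orthogonal to d
        O = T + delta * u   # point of the axis at distance delta from T
        C = O - 0.5 * d     # focal point placed somewhere on the axis
        residuals.append(tripod_residual(C, d, T, delta))
    return residuals

print(demo_residuals())
```

Under these assumptions the residual vanishes for every pan angle, while a focal point displaced away from the modeled axis yields a nonzero residual. A term of this kind could be used to penalize camera positions that drift away from the tripod.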