Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
Chamath Ranasinghe, Uthayasanker Thayasivam
TL;DR
The paper tackles football player tracking under frequent occlusions and visually similar uniforms by proposing a lightweight pipeline that combines SAM-based initialization with CSRT tracking and jersey-color appearance models for re-identification. This team-aware approach uses domain-specific cues to improve occlusion recovery while remaining computationally efficient enough for post-match analysis. A multi-dimensional evaluation framework assesses speed, accuracy, and robustness; it shows strong performance under light and moderate occlusion but limited long-term re-acquisition, which highlights the need for memory-based re-identification. The paper also offers practical guidelines for deploying such systems under resource constraints and points to future extensions using memory-enabled SAM variants and stronger re-identification strategies for long-duration robustness.
Abstract
Football player tracking is challenged by frequent occlusions, similar appearances, and rapid motion in crowded scenes. This paper presents a lightweight SAM-based tracking method combining the Segment Anything Model (SAM) with CSRT trackers and jersey color-based appearance models. We propose a team-aware tracking system that uses SAM for precise initialization and HSV histogram-based re-identification to improve occlusion recovery. Our evaluation measures three dimensions: processing speed (FPS and memory), tracking accuracy (success rate and box stability), and robustness (occlusion recovery and identity consistency). Experiments on football video sequences show that the approach achieves 7.6-7.7 FPS with stable memory usage (~1880 MB), maintaining 100 percent tracking success in light occlusions and 90 percent in crowded penalty-box scenarios with 5 or more players. Appearance-based re-identification recovers 50 percent of heavy occlusions, demonstrating the value of domain-specific cues. Analysis reveals key trade-offs: the SAM + CSRT combination provides consistent performance across crowd densities but struggles with long-term occlusions where players leave the frame, achieving only 8.66 percent re-acquisition success. These results offer practical guidelines for deploying football tracking systems under resource constraints, showing that classical tracker-based methods work well with continuous visibility but require stronger re-identification mechanisms for extended absences.
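To make the HSV histogram-based re-identification step concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the bin counts, the similarity measure, or the acceptance threshold, so the 8x4 hue-saturation binning, the Bhattacharyya coefficient, and the `min_score` value below are all assumptions, as are the function names. It uses only the Python standard library and treats a player crop as a plain list of RGB pixels.

```python
import colorsys
from math import sqrt


def hs_histogram(pixels, h_bins=8, s_bins=4):
    """Normalized hue-saturation histogram of (r, g, b) pixels in 0..255.

    Bin counts are illustrative assumptions, not values from the paper.
    """
    hist = [0.0] * (h_bins * s_bins)
    count = 0
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)  # clamp h == 1.0 into last bin
        si = min(int(s * s_bins), s_bins - 1)
        hist[hi * s_bins + si] += 1.0
        count += 1
    if count == 0:
        return hist
    return [x / count for x in hist]


def bhattacharyya_coefficient(p, q):
    """Similarity in [0, 1] between two normalized histograms (1 = identical)."""
    return sum(sqrt(a * b) for a, b in zip(p, q))


def best_match(candidate_hist, gallery, min_score=0.5):
    """Return the track id whose stored histogram is most similar to the
    candidate, or None if no score reaches min_score (hypothetical threshold).
    """
    best_id, best_score = None, min_score
    for track_id, ref_hist in gallery.items():
        score = bhattacharyya_coefficient(candidate_hist, ref_hist)
        if score >= best_score:
            best_id, best_score = track_id, score
    return best_id


# Usage: two stored jersey models and a re-appearing detection.
gallery = {
    "red": hs_histogram([(220, 20, 30)] * 50),
    "blue": hs_histogram([(20, 30, 220)] * 50),
}
candidate = hs_histogram([(200, 30, 40)] * 40)  # slightly darker red jersey
matched_id = best_match(candidate, gallery)
```

Dropping the value channel from the histogram is a common way to reduce sensitivity to lighting changes. In this sketch, a candidate whose colors match no stored model closely enough is left unassigned rather than forced onto the nearest track.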
