A General Framework for Jersey Number Recognition in Sports Video

Maria Koshkina, James H. Elder

TL;DR

This work reframes jersey number recognition in sports video as a scene text recognition (STR) problem and proposes a robust pipeline that needs little fine-tuning and operates at both the image and tracklet levels. It combines a legibility classifier, pose-based torso localization, and a fine-tuned STR model (PARSeq) to detect and recognize jersey numbers, and it extends to tracklets through main-subject filtering and prediction consolidation. A new image-level hockey jersey number dataset enables evaluation. The pipeline shows strong cross-domain performance, achieving 91.4% accuracy on hockey image-level data and 87.45% on SoccerNet tracklets, with 79.31% on the challenge partition. The approach generalizes across sports and camera geometries, offering a practical path toward integrating jersey number recognition into automated player tracking and sports analytics.

Abstract

Jersey number recognition is an important task in sports video analysis, partly because it supports long-term player tracking. It can be viewed as a variant of scene text recognition. However, there is a lack of published attempts to apply scene text recognition models to jersey number data. Here we introduce a novel public jersey number recognition dataset for hockey and study how scene text recognition methods can be adapted to this problem. We address issues of occlusion and assess the degree to which training on one sport (hockey) generalizes to another (soccer). For the latter, we also consider how jersey number recognition at the single-image level can be aggregated across frames to yield tracklet-level jersey number labels. We demonstrate high performance on image- and tracklet-level tasks, achieving 91.4% accuracy for hockey images and 87.4% for soccer tracklets. Code, models, and data are available at https://github.com/mkoshkina/jersey-number-pipeline.
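
To make the idea of aggregating single-image predictions into a tracklet-level label concrete, here is a minimal sketch. It is not the consolidation rule used in the paper: it assumes a hypothetical per-frame recognizer that returns a number string (or None for an illegible frame) together with a confidence, and it substitutes simple confidence-weighted voting as a stand-in. The function name `consolidate_tracklet` and the `min_conf` threshold are illustrative assumptions.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple

# One per-frame prediction: (jersey number as text, or None if illegible; confidence in [0, 1]).
FramePrediction = Tuple[Optional[str], float]


def consolidate_tracklet(
    predictions: Iterable[FramePrediction],
    min_conf: float = 0.5,  # hypothetical cutoff, not a value from the paper
) -> Optional[str]:
    """Return one jersey number for a tracklet by confidence-weighted voting.

    Frames marked illegible (None) or below min_conf are ignored.
    Returns None when no frame yields a usable prediction.
    """
    votes = defaultdict(float)
    for number, conf in predictions:
        if number is None or conf < min_conf:
            continue
        votes[number] += conf  # each usable frame votes with its confidence
    if not votes:
        return None
    return max(votes, key=votes.get)


# Hypothetical per-frame outputs for a single tracklet.
frames = [("17", 0.9), ("11", 0.6), ("17", 0.8), (None, 0.0), ("7", 0.3)]
label = consolidate_tracklet(frames)
```

Weighting votes by confidence lets a few clear frames outweigh many blurry or occluded ones. Returning None keeps tracklets with no legible frame distinguishable from tracklets with a recognized number.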

Paper Structure (25 sections, 1 equation, 7 figures, 9 tables)

This paper contains 25 sections, 1 equation, 7 figures, 9 tables.

Figures (7)

  • Figure 1: Pipeline of image-level jersey number detection and recognition.
  • Figure 2: Sample images from Hockey and SoccerNet datasets.
  • Figure 3: Hockey Dataset jersey number distribution.
  • Figure 4: SoccerNet Dataset jersey number distribution.
  • Figure 5: Sample jersey number crops automatically extracted from player images.
  • ...and 2 more figures