OpenSTARLab: Open Approach for Spatio-Temporal Agent Data Analysis in Soccer

Calvin Yeung, Kenjiro Ide, Taiga Someya, Keisuke Fujii

TL;DR

OpenSTARLab tackles the data accessibility and interoperability bottlenecks in soccer analytics by delivering an open-source framework that standardizes event and tracking data via the UIED and SAR formats, provides a Spatial-Temporal Event labeling tool, and integrates deep learning as well as multi-agent reinforcement learning workflows. The Pre-processing, Event Modeling, and RLearn packages form a cohesive pipeline enabling data from multiple providers to be annotated, harmonized, modeled, simulated, and visualized, with empirical results showing superior action and time prediction for the LEM family of models and informative RL insights. The work demonstrates practical impact by enabling researchers to benchmark predictive models, simulate event dynamics, and analyze decision-making in soccer, contributing to more accessible, reproducible, and scalable analytics. Overall, OpenSTARLab advances democratization, collaboration, and innovation in soccer data science, while outlining limitations and concrete avenues for extending to other sports and richer player evaluations.

Abstract

Sports analytics has become both more professional and more sophisticated, driven by the growing availability of detailed performance data. This progress enables applications such as match outcome prediction, player scouting, and tactical analysis. In soccer, the effective utilization of event and tracking data is fundamental for capturing and analyzing the dynamics of the game. However, there are two primary challenges: the limited availability of event data, primarily restricted to top-tier teams and leagues, and the scarcity and high cost of tracking data, which complicates its integration with event data for comprehensive analysis. Here we propose OpenSTARLab, an open-source framework designed to democratize spatio-temporal agent data analysis in sports by addressing these key challenges. OpenSTARLab includes the Pre-processing Package, which standardizes event and tracking data through the Unified and Integrated Event Data (UIED) and State-Action-Reward (SAR) formats; the Event Modeling Package, which implements deep learning-based event prediction; and the RLearn Package for reinforcement learning tasks. These components handle diverse data sources and support advanced analytical tasks, improving the functionality and usability of the framework. To assess OpenSTARLab's effectiveness, we conducted several experimental evaluations. They show that a specific event prediction model achieves superior action and time prediction accuracy while maintaining robust event simulation performance. Furthermore, the reinforcement learning experiments reveal a trade-off between action accuracy and temporal difference loss, and are accompanied by comprehensive visualizations. Overall, OpenSTARLab serves as a robust platform for researchers and practitioners, enhancing innovation and collaboration in the field of soccer data analytics.

Paper Structure

This paper contains 29 sections, 3 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Overview of the OpenSTARLab application on soccer. The dashed rectangle indicates steps that may be required depending on the specific soccer match video. Soccer match image from SoccerNet (Deliège et al., 2021).
  • Figure 2: Interface of the STE label tool. Soccer match image from SoccerNet (Deliège et al., 2021).
  • Figure 3: Configuration of STE label tool.
  • Figure 4: UIED standardized pitch coordinates format. The red, blue, and black points represent the positions of the attacking team, defending team, and event location, respectively. The pitch is oriented such that the attacking direction is always from the left ($x=0$) to the right ($x=105$).
  • Figure 5: Simulation performance of the LEM_3 model. The exact values for the simulation are available at https://github.com/open-starlab/Event/blob/main/event/sports/soccer/examples/simulation_loss.csv
  • ...and 4 more figures
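
The orientation convention in the Figure 4 caption (the attacking direction always runs from $x=0$ to $x=105$) can be illustrated with a minimal sketch. This is not the OpenSTARLab implementation: the `Point` type, the function name, the `attacks_right` flag, and the pitch width of 68 are assumptions introduced for illustration only; the caption states only the pitch length of 105.

```python
from dataclasses import dataclass

PITCH_LENGTH = 105.0  # from the Figure 4 caption: x runs from 0 to 105
PITCH_WIDTH = 68.0    # assumption: the caption does not state the width


@dataclass(frozen=True)
class Point:
    """A position on the pitch (hypothetical helper type)."""
    x: float
    y: float


def to_attacking_left_to_right(p: Point, attacks_right: bool) -> Point:
    """Express p in a frame where the attacking team attacks toward x=105.

    If the team already attacks toward x=105, the point is returned as-is.
    Otherwise the point is rotated 180 degrees about the pitch centre,
    which flips both axes and keeps left/right flanks consistent.
    """
    if attacks_right:
        return p
    return Point(PITCH_LENGTH - p.x, PITCH_WIDTH - p.y)


if __name__ == "__main__":
    # An event recorded while the team attacks toward x=0.
    raw = Point(10.0, 20.0)
    print(to_attacking_left_to_right(raw, attacks_right=False))
```

A 180-degree rotation is used here rather than a mirror of the x-axis alone so that a player on the attacking team's right flank stays on the right flank after normalization; whether UIED makes the same choice is not stated in the caption.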