OpenSTARLab: Open Approach for Spatio-Temporal Agent Data Analysis in Soccer
Calvin Yeung, Kenjiro Ide, Taiga Someya, Keisuke Fujii
TL;DR
OpenSTARLab tackles the data accessibility and interoperability bottlenecks in soccer analytics with an open-source framework that standardizes event and tracking data via the UIED and SAR formats, provides a Spatial-Temporal Event labeling tool, and integrates deep learning and multi-agent reinforcement learning workflows. The Pre-processing, Event Modeling, and RLearn packages form a cohesive pipeline in which data from multiple providers can be annotated, harmonized, modeled, simulated, and visualized. Empirical results show superior action and time prediction for the LEM family of models, and the reinforcement learning experiments reveal a trade-off between action accuracy and temporal difference loss. The framework lets researchers benchmark predictive models, simulate event dynamics, and analyze decision-making in soccer, contributing to more accessible, reproducible, and scalable analytics. Overall, OpenSTARLab advances democratization, collaboration, and innovation in soccer data science, while outlining its limitations and concrete avenues for extending it to other sports and richer player evaluations.
Abstract
Sports analytics has become more professional and more sophisticated, driven by the growing availability of detailed performance data. This progress enables applications such as match outcome prediction, player scouting, and tactical analysis. In soccer, the effective use of event and tracking data is fundamental to capturing and analyzing the dynamics of the game. However, there are two primary challenges: event data are of limited availability, being largely restricted to top-tier teams and leagues, and tracking data are scarce and costly, which complicates their integration with event data for comprehensive analysis. Here we propose OpenSTARLab, an open-source framework designed to democratize spatio-temporal agent data analysis in sports by addressing these key challenges. OpenSTARLab includes the Pre-processing Package, which standardizes event and tracking data through the Unified and Integrated Event Data (UIED) and State-Action-Reward (SAR) formats; the Event Modeling Package, which implements deep learning-based event prediction; and the RLearn Package, which supports reinforcement learning tasks. These components facilitate the handling of diverse data sources and support advanced analytical tasks, thereby improving the functionality and usability of the framework. To assess OpenSTARLab's effectiveness, we conducted several experimental evaluations. These show that a specific event prediction model achieves superior action and time prediction accuracy while maintaining robust event simulation performance. Furthermore, reinforcement learning experiments reveal a trade-off between action accuracy and temporal difference loss and provide comprehensive visualizations. Overall, OpenSTARLab serves as a robust platform for researchers and practitioners, fostering innovation and collaboration in the field of soccer data analytics.
