Expandable Decision-Making States for Multi-Agent Deep Reinforcement Learning in Soccer Tactical Analysis

Kenjiro Ide, Taiga Someya, Kohei Kawaguchi, Keisuke Fujii

TL;DR

This work tackles the challenge of analyzing tactical play in soccer with interpretable, agent‑level decisions in a high‑dimensional, multi‑agent setting. It introduces Expandable Decision‑Making States (EDMS), a semantically rich state representation with an action masking scheme that differentiates on‑ball and off‑ball decision spaces, and extends rewards with EPV signals to better propagate goal probabilities ($L_{td}$, $L_{as}$, $L_{L1}$ are minimized under $\nabla=1$). Empirical results show that EDMS with masking reduces action‑prediction loss and TD error, while qualitative analyses reveal tactical patterns such as fast counters and defensive breakthroughs. Integration with the OpenSTARLab RLearn library supports use across multiple datasets, enabling cross‑provider benchmarking and reproducible evaluation and offering a practical path toward data‑driven tactical insight and coaching support in real matches.

Abstract

Invasion team sports such as soccer produce a high-dimensional, strongly coupled state space as many players continuously interact on a shared field, challenging quantitative tactical analysis. Traditional rule-based analyses are intuitive, while modern predictive machine learning models often perform pattern-matching without explicit agent representations. The problem we address is how to build player-level agent models from data, whose learned values and policies are both tactically interpretable and robust across heterogeneous data sources. Here, we propose Expandable Decision-Making States (EDMS), a semantically enriched state representation that augments raw positions and velocities with relational variables (e.g., scoring of space, pass, and score), combined with an action-masking scheme that gives on-ball and off-ball agents distinct decision sets. Compared to prior work, EDMS maps learned value functions and action policies to human-interpretable tactical concepts (e.g., marking pressure, passing lanes, ball accessibility) instead of raw coordinate features, and aligns agent choices with the rules of play. In the experiments, EDMS with action masking consistently reduced both action-prediction loss and temporal-difference (TD) error compared to the baseline. Qualitative case studies and Q-value visualizations further indicate that EDMS highlights high-risk, high-reward tactical patterns (e.g., fast counterattacks and defensive breakthroughs). We also integrated our approach into an open-source library and demonstrated compatibility with multiple commercial and open datasets, enabling cross-provider evaluation and reproducible experiments.
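The action-masking idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the action vocabulary (`MOVES`, `ON_BALL`), the rule that the on-ball player may use every action while off-ball players may only move, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical action vocabulary: eight movement directions plus ball actions.
MOVES = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
ON_BALL = ["pass", "shot", "dribble"]
ACTIONS = MOVES + ON_BALL


def action_mask(has_ball: bool) -> np.ndarray:
    """Return a boolean mask over ACTIONS (True = action is allowed).

    Assumption: off-ball agents cannot pass, shoot, or dribble; the
    on-ball agent may choose any action.
    """
    mask = np.ones(len(ACTIONS), dtype=bool)
    if not has_ball:
        mask[len(MOVES):] = False
    return mask


def masked_greedy_action(q_values: np.ndarray, has_ball: bool) -> str:
    """Pick the highest-valued action among those the mask allows."""
    # Disallowed actions get -inf so argmax can never select them.
    masked_q = np.where(action_mask(has_ball), q_values, -np.inf)
    return ACTIONS[int(np.argmax(masked_q))]


# Example Q-values: "pass" has the highest value overall,
# "E" has the highest value among the movement actions.
q = np.array([0.1, 0.1, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.5, 0.2])
on_ball_choice = masked_greedy_action(q, has_ball=True)
off_ball_choice = masked_greedy_action(q, has_ball=False)
```

With the same Q-values, the on-ball agent selects the ball action while the off-ball agent is restricted to the best movement, which is the sense in which masking aligns agent choices with the rules of play.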

Paper Structure

This paper contains 23 sections, 5 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: This figure illustrates the procedure for calculating the space score. From left to right, it shows: the Voronoi diagram (spatial area) considering each player's position and velocity; the importance of each area on the pitch; and the final space score distribution, which is the product of the two. The area importance is modeled based on proximity to the opponent's goal (sigmoid function) and distance from the center of the pitch (Gaussian function), and it is maximized in the central area in front of the opponent's goal.
  • Figure 2: Relationship between each team's Q-value and various attacking metrics. The left plot shows the relationship with the total season Expected Goal (xG), and the right plot shows the relationship with the difference (Diff) between total season goals and xG.
  • Figure 3: A build-up scene from the match between Shimizu S-Pulse and Avispa Fukuoka. The left plot shows the positions of all players and the ball, with Avispa Fukuoka attacking from left to right. The right plot shows the respective Q-values for the eight directional movements of the players enclosed in the yellow box on the left.
  • Figure 4: An offside situation from the match between Kyoto Sanga and Sagan Tosu, showing the moment before the pass was made. The figure layout is the same as in Figure 3.
  • Figure 5: The same offside situation as in Figure 4, showing the moment after the pass was made. The figure layout is the same as in Figure 3.
  • ...and 4 more figures
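
The space score procedure described in the Figure 1 caption (spatial control multiplied by area importance) can be sketched roughly as follows. This is an illustrative approximation, not the paper's code: the pitch dimensions, the grid resolution, the parameters `x0`, `k`, `sigma`, and `dt`, and the use of a velocity-shifted nearest-player assignment as a stand-in for the velocity-aware Voronoi diagram are all assumptions. The Gaussian term is also assumed to act on lateral distance from the pitch's central axis.

```python
import numpy as np

PITCH_X, PITCH_Y = 105.0, 68.0  # assumed pitch size in metres


def area_importance(gx, gy, x0=70.0, k=0.1, sigma=15.0):
    """Sigmoid in x (toward the goal at x = PITCH_X) times a Gaussian
    in lateral offset from the central axis. Parameters are hypothetical."""
    toward_goal = 1.0 / (1.0 + np.exp(-k * (gx - x0)))
    central = np.exp(-((gy - PITCH_Y / 2.0) ** 2) / (2.0 * sigma ** 2))
    return toward_goal * central


def space_score(pos, vel, is_attacker, dt=1.0, nx=53, ny=35):
    """Grid of space scores for the attacking team.

    pos, vel: arrays of shape (n_players, 2); is_attacker: bool array (n_players,).
    Each grid cell is assigned to the player whose velocity-shifted position
    (pos + dt * vel) is nearest, a simple proxy for a velocity-aware Voronoi cell.
    """
    xs = np.linspace(0.0, PITCH_X, nx)
    ys = np.linspace(0.0, PITCH_Y, ny)
    gx, gy = np.meshgrid(xs, ys, indexing="ij")  # each of shape (nx, ny)
    reach = pos + dt * vel
    # Squared distance from every cell to every player: shape (nx, ny, n_players).
    d2 = (gx[..., None] - reach[:, 0]) ** 2 + (gy[..., None] - reach[:, 1]) ** 2
    owner = np.argmin(d2, axis=-1)
    controlled = is_attacker[owner].astype(float)  # 1 where an attacker owns the cell
    return controlled * area_importance(gx, gy)    # product of control and importance


# Toy example: one stationary attacker near the goal, one stationary defender.
pos = np.array([[80.0, 34.0], [20.0, 34.0]])
vel = np.zeros((2, 2))
is_attacker = np.array([True, False])
score = space_score(pos, vel, is_attacker)
```

In this toy setup the score is zero wherever the defender is closer and peaks in the central zone in front of the goal at `x = PITCH_X`, matching the caption's description of where importance is maximized.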