Prediction-based evaluation of back-four defense with spatial control in soccer

Soujanya Dash, Kenjiro Ide, Rikuhei Umemoto, Kai Amino, Keisuke Fujii

TL;DR

Addressing the challenge of quantifying collective back-four defense during negative transitions in elite soccer, the study introduces interpretable spatio-temporal indicators (Space Score, Stretch Index, Pressure Index, and Defensive Line Height Absolute/Relative) derived from synchronized tracking and event data. Analyzing 2,413 defensive sequences from 73 LaLiga matches (Barcelona and Real Madrid) with two-way ANOVA and team-specific predictive models (XGBoost, Random Forest, SVC) reveals that Defensive Line Height relative to the ball is the strongest predictor of defensive success, with Space Score also playing a crucial role. Barcelona displays stronger, more consistent spatial control and line coordination, while Real Madrid shows more adaptive but less stable defensive structures. The work demonstrates that combining interpretable spatial metrics with inferential and predictive analyses provides actionable insights for coaching and real-time tactical analytics in elite soccer.

Abstract

Defensive organization is critical in soccer, particularly during negative transitions, when teams are most vulnerable. The back-four defensive line plays a decisive role in preventing goal-scoring opportunities, yet its collective coordination remains difficult to quantify. This study introduces interpretable spatio-temporal indicators, namely space control, stretch index, pressure index, and defensive line height (absolute and relative), to evaluate the effectiveness of the back four during defensive transitions. Using synchronized tracking and event data from the 2023-24 LaLiga season, 2,413 defensive sequences following possession losses by FC Barcelona and Real Madrid CF were analyzed. Two-way ANOVA revealed significant effects of team, outcome, and their interaction for key indicators, with relative line height showing the strongest association with defensive success. Among the predictive models, XGBoost achieved the highest discriminative performance (ROC AUC: 0.724 for Barcelona, 0.698 for Real Madrid) and identified space score and relative line height as the dominant predictors. Comparative analysis revealed distinct team-specific defensive behaviors: Barcelona's success was characterized by higher spatial control and compact line coordination, whereas Real Madrid exhibited more adaptive but less consistent defensive structures. These findings demonstrate the tactical and predictive value of interpretable spatial indicators for quantifying collective defensive performance.
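
As a reading aid for the ROC AUC figures above: ROC AUC equals the probability that a classifier scores a randomly chosen positive example (here, a successful defensive sequence) above a randomly chosen negative one, with ties counted as half. The following is a minimal sketch of that pairwise definition in plain Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) form of ROC AUC for binary labels 0/1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = defensive success, 0 = failure.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values near 0.70 indicate moderate discrimination.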

Paper Structure

This paper contains 1 section, 5 equations, 4 figures, 6 tables.

Table of Contents

  1. Introduction

Figures (4)

  • Figure 1: Tactical zones used in the computation of the space score. The field is divided into four analytically defined zones prioritized by tactical importance: (1) Central Final Third (red), (2) Penalty Box Proximity (orange), (3) Wing Pockets (green), and (4) Ball-Carrier Radius (blue). Each zone is assigned a weight and dynamically assessed based on the number of defenders and attackers present. Zone overlaps are resolved by prioritizing higher-weighted zones. This framework enables frame-by-frame quantification of spatial control during defensive transitions.
  • Figure 2: Single-frame illustration of back-four extraction and derived geometric features. The four deepest outfield defenders (blue markers) form the back-four polygon (shaded). Three nearest attackers are shown in red and the ball as a yellow star. Annotated values include frame ID and timestamp (top left), compactness (convex-hull area) and pressure index (top left box), the line mean (cyan dashed line) and the positions bounding the hull (blue dotted lines). This frame (example ID: 60706) is presented to clarify how per-frame measures (convex hull area, line mean, defender–attacker distances and ball-relative line height) are computed prior to sequence aggregation.
  • Figure 3: Interaction effects between team and defensive success across features. Line height (relative) exhibits a clear crossover pattern, indicating differential team behavior during transitions.
  • Figure 4: Correlation heatmap of handcrafted defensive indicators. While some moderate correlations exist (e.g., line height and space score), multicollinearity remains acceptable across features.
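
The captions of Figures 1 and 2 describe per-frame computations that can be sketched roughly as follows. This is a hedged illustration only: the zone rectangles, the weights (4, 3, 2, 1), the 5 m ball radius, the pitch frame (metres, defending goal on x = 0), the count-based control formula, and the sign convention for relative line height are all assumptions made for this example, since the captions do not give the exact definitions. The attacker-distance measure is a stand-in for the pressure index, whose formula is not stated here.

```python
from math import hypot

# Assumed frame: 105 x 68 m pitch, defending goal on the line x = 0.
# Zones are listed highest weight first so overlaps resolve to the top-priority zone.
ZONES = [
    ("central_final_third", 4.0, lambda p, ball: p[0] <= 35 and 20 <= p[1] <= 48),
    ("penalty_box_proximity", 3.0, lambda p, ball: p[0] <= 22 and 10 <= p[1] <= 58),
    ("wing_pockets", 2.0, lambda p, ball: p[0] <= 35 and (p[1] < 20 or p[1] > 48)),
    ("ball_carrier_radius", 1.0,
     lambda p, ball: hypot(p[0] - ball[0], p[1] - ball[1]) <= 5),
]


def space_score(defenders, attackers, ball):
    """Weighted zone control; each player counts once, in the best zone containing them."""
    counts = {name: [0, 0] for name, _, _ in ZONES}  # [defenders, attackers]
    for side, players in enumerate((defenders, attackers)):
        for p in players:
            for name, _, contains in ZONES:
                if contains(p, ball):
                    counts[name][side] += 1
                    break  # overlap resolved by priority order
    score = 0.0
    for name, weight, _ in ZONES:
        d, a = counts[name]
        if d + a:  # assumed control term: normalised defender surplus in [-1, 1]
            score += weight * (d - a) / (d + a)
    return score


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def polygon_area(poly):
    """Shoelace formula."""
    n = len(poly)
    twice = sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
                for i in range(n))
    return abs(twice) / 2


def frame_features(defenders, attackers, ball):
    """Per-frame back-four features; `defenders` are outfield players only."""
    line = sorted(defenders, key=lambda p: p[0])[:4]  # four deepest (nearest x = 0)
    line_mean_x = sum(p[0] for p in line) / len(line)
    nearest = sorted(min(hypot(a[0] - d[0], a[1] - d[1]) for d in line)
                     for a in attackers)[:3]  # three attackers closest to the line
    return {
        "hull_area": polygon_area(convex_hull(line)),
        "line_height_abs": line_mean_x,            # distance from own goal line
        "line_height_rel": ball[0] - line_mean_x,  # assumed sign: ball ahead of line > 0
        "mean_attacker_distance": sum(nearest) / len(nearest),
    }
```

In a full pipeline these per-frame values would then be aggregated over each defensive sequence, as the Figure 2 caption notes; the aggregation rule is not specified here, so it is left out of the sketch.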