Beyond Expected Goals: A Probabilistic Framework for Shot Occurrences in Soccer
Jonathan Pipping, Tianshu Feng, R. Paul Sabin
TL;DR
This work extends traditional expected goals (xG) by introducing xG+, a possession-level framework that jointly models shot-creation and shot-conversion probabilities. By aggregating across a possession, xG+ accounts for near-misses and sequential attack dynamics, addressing the conditioning-on-shots limitation of standard xG. Using Gradient Sports EPL tracking data and XGBoost, the authors show that xG+ improves team-level predictions and yields more persistent player signals than xG alone. The study provides insights into feature importance, validates across seasons, and outlines future directions including sequence modeling and defensive credit.
Abstract
Expected goals (xG) models estimate, from a shot's context (e.g., location, pressure), the probability that the shot results in a goal, but they operate only on observed shots. We propose xG+, a possession-level framework that estimates both the probability that a shot occurs within the next second and the xG that shot would have if it occurred. We also introduce ways to aggregate this joint probability estimate over the course of a possession. By jointly modeling shot-taking behavior and shot quality, xG+ remedies the conditioning-on-shots limitation of standard xG. We show that this improves predictive accuracy at the team level and produces a more persistent player skill signal than standard xG models.
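To make the joint estimate concrete, the following minimal Python sketch multiplies a per-moment shot probability by the corresponding conditional xG and then aggregates over a possession. The function names, the toy inputs, and the three aggregation rules (sum, maximum, and an "at least one goal" rule that treats moments as independent) are illustrative assumptions; the abstract does not specify which aggregations the authors use.

```python
from math import prod


def joint_goal_probs(shot_probs, xg_values):
    """Per-moment joint value: P(shot in next second) * P(goal | shot).

    Both inputs are hypothetical model outputs, one entry per moment
    of a single possession.
    """
    if len(shot_probs) != len(xg_values):
        raise ValueError("inputs must have the same length")
    return [p * x for p, x in zip(shot_probs, xg_values)]


def aggregate_sum(joint):
    """Assumed aggregation: total joint value over the possession."""
    return sum(joint)


def aggregate_max(joint):
    """Assumed aggregation: the single most dangerous moment."""
    return max(joint, default=0.0)


def aggregate_any(joint):
    """Assumed aggregation: P(at least one goal), treating moments as
    independent. This is a simplification, not a claim from the paper."""
    return 1.0 - prod(1.0 - j for j in joint)


# Toy possession with three moments (made-up numbers).
shot_probs = [0.0, 0.5, 1.0]   # P(shot within the next second)
xg_values = [0.2, 0.4, 0.1]    # xG if a shot were taken at that moment
joint = joint_goal_probs(shot_probs, xg_values)
```

A moment with no observed shot still contributes whenever its shot probability is nonzero, which is how a framework of this kind can credit near-misses that standard xG ignores.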
