Large-Scale In-Game Outcome Forecasting for Match, Team and Players in Football using an Axial Transformer Neural Network
Michael Horton, Patrick Lucey
TL;DR
This work addresses the challenge of real-time, multi-action forecasting in football by introducing a novel axial transformer architecture that jointly models temporal dynamics and inter-agent interactions using multi-modal inputs. The model ingests player, team, and game-context features, producing end-of-match totals for numerous actions across all players and teams, with live in-game updates and low latency. Key contributions include a new axial-attention formulation equivalent to regular masked self-attention but with improved efficiency, comprehensive live-inference capability (~505 predictions per time-step, ~150 time-steps per game), and empirical evidence from a large-scale dataset demonstrating calibrated, consistent forecasts and the importance of temporal and inter-agent modeling. The approach enables scalable in-game analytics, tactical decision support, and potential live betting and broadcast applications, offering a practical path to real-time, multi-agent sports forecasting.
Abstract
Football (soccer) is a sport characterised by complex game play, in which players perform a variety of actions, such as passes, shots, tackles, and fouls, in order to score goals and ultimately win matches. Accurately forecasting the total number of each action that each player will complete during a match is desirable for a variety of applications, including tactical decision-making, sports betting, and television broadcast commentary and analysis. Such predictions must consider the game state, the ability and skill of the players on both teams, the interactions between the players, and the temporal dynamics of the game as it develops. In this paper, we present a transformer-based neural network that jointly and recurrently predicts the expected totals for thirteen individual actions at multiple time-steps during the match, with predictions made for each individual player, for each team, and at the game level. The neural network is based on an \emph{axial transformer} that efficiently captures both the temporal dynamics as the game progresses and the interactions between the players at each time-step. We present a novel axial transformer design that we show is equivalent to a regular sequential transformer and that performs well experimentally. We show empirically that the model makes consistent and reliable predictions, and that it efficiently makes $\sim$75,000 live predictions at low latency for each game.
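
To make the idea of attending along two axes concrete, the following is a minimal sketch of generic axial attention over a (time-step, agent, feature) tensor. It is not the novel design proposed in the paper: it uses a single head, no learned projections, and plain residual connections, and the names `attention` and `axial_block` and all tensor sizes are assumptions chosen for illustration. It shows only the basic pattern of causally masked attention along the temporal axis followed by unmasked attention across agents within each time-step.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v, mask=None):
    # Scaled dot-product attention over the second-to-last axis.
    # q, k, v: (batch, L, D); mask: (L, L) boolean, True where attending is allowed.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)
    return softmax(scores, axis=-1) @ v


def axial_block(x):
    # x: (T, N, D) = (time-steps, agents, features). Hypothetical layout.
    T, N, D = x.shape

    # 1) Temporal axis: each agent attends over its own history only.
    #    Agents act as the batch dimension, and the mask is lower-triangular (causal).
    xt = np.swapaxes(x, 0, 1)                      # (N, T, D)
    causal = np.tril(np.ones((T, T), dtype=bool))
    xt = xt + attention(xt, xt, xt, mask=causal)   # residual connection
    x = np.swapaxes(xt, 0, 1)                      # back to (T, N, D)

    # 2) Agent axis: within each time-step, every agent attends to all agents.
    #    Time-steps act as the batch dimension, so no mask is needed.
    x = x + attention(x, x, x)
    return x


def score_counts(T, N):
    # Number of attention scores: full attention over T*N tokens vs. axial attention.
    return (T * N) ** 2, N * T**2 + T * N**2


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4, 8))   # toy sizes, not the paper's
y = axial_block(x)
print(y.shape)
```

Because the temporal pass is causally masked and the agent pass mixes information only within a single time-step, the output at time-step t depends only on inputs at time-steps up to t, which is the property live in-game inference requires. The `score_counts` helper illustrates the efficiency argument: full attention over all T·N tokens computes (T·N)² scores, whereas the two axial passes compute N·T² + T·N².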
