Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports

Yi Xu, Yun Fu

TL;DR

UniTraj addresses the fragmentation of multi-agent trajectory tasks (prediction, imputation, and spatial-temporal recovery) by casting them as a single masked-trajectory generation problem. It introduces a Ghost Spatial Masking (GSM) module to summarize spatial patterns and extends Mamba with a Bidirectional Temporal Scaled (BTS) module to capture long-range temporal dependencies, all within a CVAE framework. The paper reports state-of-the-art results on three sports benchmarks (Basketball-U, Football-U, Soccer-U), supported by ablations and qualitative analyses. The datasets, code, and model weights are publicly released.

Abstract

Understanding multi-agent movement is critical across various fields. Conventional approaches typically focus on separate tasks such as trajectory prediction, imputation, or spatial-temporal recovery. Given the unique formulation and constraints of each task, most existing methods are tailored to only one, limiting their ability to handle multiple tasks simultaneously, which is a common requirement in real-world scenarios. Another limitation is that widely used public datasets mainly focus on pedestrian movements with casual, loosely connected patterns, where interactions between individuals are not always present, especially at long distances, making them less representative of more structured environments. To overcome these limitations, we propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs, adaptable to diverse scenarios in the domain of sports games. Specifically, we introduce a Ghost Spatial Masking (GSM) module, embedded within a Transformer encoder, for spatial feature extraction. We further extend Mamba, a recent State Space Model (SSM), into a Bidirectional Temporal Mamba (BTM) to better capture temporal dependencies. Additionally, we incorporate a Bidirectional Temporal Scaled (BTS) module to thoroughly scan trajectories while preserving temporal missing relationships. Furthermore, we curate and benchmark three practical sports datasets, Basketball-U, Football-U, and Soccer-U, for evaluation. Extensive experiments demonstrate the superior performance of our model. We hope that our work can advance the understanding of human movement in real-world applications, particularly in sports. Our datasets, code, and model weights are available at https://github.com/colorfulfuture/UniTraj-pytorch.
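
To make the "arbitrary trajectories as masked inputs" idea concrete, the sketch below shows how the three tasks named in the abstract could be expressed as binary masks over an agents-by-time grid. It is an illustration only and is not taken from the UniTraj code: the function names `make_mask` and `apply_mask`, the `drop_prob` parameter, and in particular the exact masking pattern assumed for spatial-temporal recovery (random gaps in the observed window plus a hidden future) are assumptions made for this example.

```python
import random

def make_mask(task, num_agents, num_steps, obs_steps, drop_prob=0.3, seed=0):
    """Build a binary mask: mask[i][t] == 1 means agent i is observed at step t.

    Hypothetical helper; the task definitions below are simplified assumptions.
    """
    rng = random.Random(seed)
    mask = [[1] * num_steps for _ in range(num_agents)]
    for row in mask:
        if task == "prediction":
            # Observe the first obs_steps frames; hide every later frame.
            for t in range(obs_steps, num_steps):
                row[t] = 0
        elif task == "imputation":
            # Hide random interior frames; keep the first and last visible.
            for t in range(1, num_steps - 1):
                if rng.random() < drop_prob:
                    row[t] = 0
        elif task == "st_recovery":
            # Assumed pattern: random gaps in the observed window
            # combined with a fully hidden future.
            for t in range(obs_steps):
                if rng.random() < drop_prob:
                    row[t] = 0
            for t in range(obs_steps, num_steps):
                row[t] = 0
        else:
            raise ValueError(f"unknown task: {task}")
    return mask

def apply_mask(traj, mask):
    """Zero out the (x, y) position wherever the mask is 0."""
    return [[(x * m, y * m) for (x, y), m in zip(agent, row)]
            for agent, row in zip(traj, mask)]

# Toy data: 3 agents, 8 time steps, positions stored as (x, y) tuples.
traj = [[(float(t), float(i)) for t in range(8)] for i in range(3)]
pred_mask = make_mask("prediction", 3, 8, obs_steps=5)
masked_traj = apply_mask(traj, pred_mask)
```

Under this view, a single generative model only ever sees a trajectory tensor and its mask, so switching tasks amounts to switching the mask rather than the architecture.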

Paper Structure

This paper contains 42 sections, 11 equations, 5 figures, 12 tables.

Figures (5)

  • Figure 1: Demonstration of three trajectory modeling tasks, trajectory prediction, imputation, and spatial-temporal (ST) recovery, for multi-agent movement analysis during an offensive possession in a basketball game, where each task takes different inputs.
  • Figure 2: Overall architecture of our UniTraj model. The encoders extract agent features and derive latent variables, while the decoder generates the complete trajectory using the sampled latent variables and agent features.
  • Figure 3: Detailed architecture of the encoding process, which consists of two main components: a Transformer encoder equipped with the GSM module, and a Mamba-based encoder featuring the BTS module. These components are designed to capture comprehensive spatial-temporal features and enable the model to learn missing patterns, thus generalizing to various missing situations.
  • Figure 4: Qualitative comparison between advanced baselines and our method. The ball's trajectory is shown in purple, offensive players are in green, and defensive players are in blue. Red "x" marks indicate masked locations and the starting points of the trajectories are highlighted with yellow stars.
  • Figure 5: Qualitative comparison between advanced baselines and our method. The ball's trajectory is shown in purple, offensive players are in green, and defensive players are in blue. Red "x" marks indicate masked locations and the starting points of the trajectories are highlighted with yellow stars.
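
The bidirectional temporal scanning mentioned for the Mamba-based encoder (Figure 3) can be illustrated with a toy example. The sketch below is not Mamba and not the paper's BTM or BTS module: it uses a fixed scalar linear recurrence in place of a selective state space layer, and the names `linear_scan` and `bidirectional_scan`, the `decay` and `gain` parameters, and the choice to sum the two directions are all assumptions made for illustration. It only shows why scanning in both directions lets each time step draw on past and future context.

```python
def linear_scan(xs, decay=0.9, gain=0.1):
    """Causal recurrence h_t = decay * h_{t-1} + gain * x_t over a 1-D sequence."""
    h, out = 0.0, []
    for x in xs:
        h = decay * h + gain * x
        out.append(h)
    return out

def bidirectional_scan(xs, decay=0.9, gain=0.1):
    """Run the same scan forward and backward in time, then sum the two.

    Summation is an assumed combination rule chosen for simplicity.
    """
    forward = linear_scan(xs, decay, gain)
    # Reverse the input, scan, then reverse back so index t lines up again.
    backward = linear_scan(xs[::-1], decay, gain)[::-1]
    return [f + b for f, b in zip(forward, backward)]

# A single impulse in the middle of the sequence influences both sides.
signal = [0.0, 0.0, 1.0, 0.0, 0.0]
features = bidirectional_scan(signal)
```

With a forward-only scan, the steps before the impulse would carry no information about it; the backward pass is what makes the response reach earlier steps, which matters when observed frames lie on both sides of a masked gap.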