Learning Group Interactions and Semantic Intentions for Multi-Object Trajectory Prediction

Mengshi Qi, Yuxin Yang, Huadong Ma

TL;DR

This work tackles multi-object trajectory prediction in sports by modeling both group-level interactions and dynamic semantic intentions. It introduces a diffusion-based framework conditioned on group tactics and uses Banzhaf Interaction within a cooperative game-theoretic module to capture intentions across agents and teams. Core components include the Interaction Encoder, Multi-Grained Feature Enhancement, and a Semantic Intention Prediction Module that learns agent-tactic affinities and predicts the Top-$k$ tactics, supported by an expanded NBA SportVU dataset with tactic annotations. Experiments on the NBA SportVU and TeamTrack benchmarks show state-of-the-art trajectory and tactic prediction performance, demonstrating the benefit of integrating group-level knowledge and game-theoretic semantics into diffusion-based forecasting for sports analytics.

Abstract

Effective modeling of group interactions and dynamic semantic intentions is crucial for forecasting behaviors such as trajectories or movements. In complex scenarios like sports, agents' trajectories are influenced by group interactions and intentions, including team strategies and opponent actions. To this end, we propose a novel diffusion-based trajectory prediction framework that integrates group-level interactions into a conditional diffusion model, enabling the generation of diverse trajectories aligned with a specific group activity. To capture dynamic semantic intentions, we frame group interaction prediction as a cooperative game, using Banzhaf interaction to model cooperation trends. We then fuse semantic intentions with enhanced agent embeddings, which are refined through both global and local aggregation. Furthermore, we expand the NBA SportVU dataset by adding human annotations of team-level tactics for trajectory and tactic prediction tasks. Extensive experiments on three widely adopted datasets demonstrate that our model outperforms state-of-the-art methods. Our source code and data are available at https://github.com/aurora-xin/Group2Int-trajectory.
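To make the cooperative-game framing more concrete, the sketch below computes the standard pairwise Banzhaf interaction index for a toy game. It is an illustration only, not the authors' implementation: the function name `banzhaf_interaction`, the player labels, and the two toy value functions are all assumptions made here, and the paper's own value function (built from agent and tactic features) is not reproduced. A positive index means two players are worth more together than apart, and a negative index means they are redundant.

```python
from itertools import combinations


def banzhaf_interaction(players, value, i, j):
    """Standard pairwise Banzhaf interaction index of players i and j.

    `value` maps a frozenset of players to a real-valued payoff.
    The index averages, over every coalition S of the other players,
    v(S + {i, j}) - v(S + {i}) - v(S + {j}) + v(S).
    """
    others = [p for p in players if p not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            s = frozenset(subset)
            total += (
                value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            )
    # There are 2 ** len(others) coalitions S, so this is a plain average.
    return total / (2 ** len(others))


# Hypothetical toy games over three players (labels are illustrative).
players = ["a", "b", "c"]


def needs_both(coalition):
    # Payoff only when "a" and "b" cooperate: a complementary pair.
    return 1.0 if {"a", "b"} <= coalition else 0.0


def needs_either(coalition):
    # Payoff when at least one of "a", "b" is present: a redundant pair.
    return 1.0 if coalition & {"a", "b"} else 0.0


print(banzhaf_interaction(players, needs_both, "a", "b"))    # 1.0
print(banzhaf_interaction(players, needs_both, "a", "c"))    # 0.0
print(banzhaf_interaction(players, needs_either, "a", "b"))  # -1.0
```

The enumeration is exponential in the number of players, which is acceptable for a handful of agents but is one reason a learned predictor of the index can be attractive in practice.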

Paper Structure

This paper contains 35 sections, 26 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Illustration of multi-agent trajectory prediction in a 3 vs. 3 basketball game. Agent trajectories are shown as circles, blue for one team, red for the opposing team, and green for the basketball. Solid circles are observed trajectories, while dashed ones are predicted positions. The blue team begins with a "Ball Movement" tactic, switching to "Single", while the red team uses "Man-to-Man Defense". Given the observed 2D trajectories and tactics, our goal is to (a) predict future trajectories and (b) forecast the tactics each team will adopt in the next frames.
  • Figure 2: Overview of our proposed method. It consists of two main parts: (1) a denoising module for diffusion-based trajectory prediction and (2) a semantic intention prediction module for team-level tactics. Specifically, the interaction encoder processes observed trajectories and group-level tactics to generate agent tokens that serve as conditions for the diffusion model to predict future trajectories. The multi-grained feature enhancement module captures global and local information to enhance the agent tokens. The Banzhaf Interaction Learner predicts the similarity between agents and the potential Top-k tactics, which can be viewed as semantic intentions, while Banzhaf Interaction Calculation computes the ground truth for supervision. Finally, we fuse the enhanced agent tokens with the semantic intentions and feed this information into the prediction head to obtain tactic predictions. The blue background denotes team-level predictions.
  • Figure 3: 3D t-SNE visualization of the clustering used to generate tactic pseudo-labels on the NBA SportVU dataset.
  • Figure 4: Distribution of team-level tactic annotations in the NBA SportVU dataset.
  • Figure 5: Visualization of trajectory prediction results on the NBA SportVU dataset. The LED method [mao2023leapfrog] is shown in the first column, our proposed approach in the second column, and the ground truth in the last column. The main differences are zoomed in for emphasis.
  • ...and 2 more figures