Unified Uncertainty-Aware Diffusion for Multi-Agent Trajectory Modeling

Guillem Capellera, Antonio Rubio, Luis Ferraz, Antonio Agudo

TL;DR

U2Diff presents a unified diffusion framework for multi-agent trajectory completion that simultaneously provides per-state uncertainty estimates and mode-wise error probabilities. By augmenting the diffusion objective with a negative log-likelihood term and propagating latent variance to the state space, it delivers interpretable uncertainty at the trajectory level, while RankNN enables reliable ranking of multiple generated modes under a common prior. The approach employs a decoupled temporal-social architecture (Temporal Mamba and Social Transformer) and a post-processing RankNN to produce per-mode error probabilities that correlate with ground-truth SADE, achieving strong scene-level performance on four sports datasets for both completion and forecasting. The combination of uncertainty estimation and error-probability ranking enhances robustness and practical utility in real-world multi-agent settings such as sports analytics.

Abstract

Multi-agent trajectory modeling has primarily focused on forecasting future states, often overlooking broader tasks like trajectory completion, which are crucial for real-world applications such as correcting tracking data. Existing methods also generally predict agents' states without offering any state-wise measure of uncertainty. Moreover, popular multi-modal sampling methods lack any error probability estimates for each generated scene under the same prior observations, making it difficult to rank the predictions during inference time. We introduce U2Diff, a unified diffusion model designed to handle trajectory completion while providing state-wise uncertainty estimates jointly. This uncertainty estimation is achieved by augmenting the simple denoising loss with the negative log-likelihood of the predicted noise and propagating latent space uncertainty to the real state space. Additionally, we incorporate a Rank Neural Network in post-processing to enable error probability estimation for each generated mode, demonstrating a strong correlation with the error relative to ground truth. Our method outperforms the state-of-the-art solutions in trajectory completion and forecasting across four challenging sports datasets (NBA, Basketball-U, Football-U, Soccer-U), highlighting the effectiveness of uncertainty and error probability estimation. Video at https://youtu.be/ngw4D4eJToE
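
The loss augmentation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the network outputs a per-element log-variance alongside the predicted noise and that the two terms are combined with a scalar weight; the function name, the `nll_weight` parameter, and the diagonal Gaussian form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def denoising_loss_with_nll(eps_true, eps_pred, log_var, nll_weight=1.0):
    """Simple denoising MSE plus a Gaussian NLL on the predicted noise.

    Hypothetical helper: the weighting and the diagonal Gaussian form
    are assumptions made for illustration.
    """
    sq_err = (eps_true - eps_pred) ** 2
    simple = sq_err.mean()
    # Per-element Gaussian NLL, dropping the constant 0.5 * log(2 * pi).
    nll = 0.5 * (log_var + sq_err / np.exp(log_var))
    return simple + nll_weight * nll.mean()

# Toy check: every element of the predicted noise is off by 0.5,
# so the squared error is 0.25 everywhere.
rng = np.random.default_rng(0)
eps_true = rng.normal(size=(4, 10, 2))
eps_pred = eps_true + 0.5

# A variance that matches the squared error is rewarded by the NLL term;
# an over-estimated variance is penalized.
log_var_good = np.full(eps_true.shape, np.log(0.25))
log_var_bad = np.full(eps_true.shape, np.log(4.0))
loss_good = denoising_loss_with_nll(eps_true, eps_pred, log_var_good)
loss_bad = denoising_loss_with_nll(eps_true, eps_pred, log_var_bad)
```

In this toy setting the NLL term is minimized when the predicted variance equals the squared noise error, which is what lets the variance act as a calibrated per-element uncertainty signal.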

Paper Structure

This paper contains 17 sections, 16 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Uncertainty-aware, unified and interpretable approach for trajectory modeling in multi-agent scenarios. U2Diff is a diffusion-based model capable of performing trajectory completion tasks such as forecasting, imputation or inferring totally unseen agents, while also jointly estimating state-wise uncertainty. RankNN is a post-processing operation that infers an error probability for each generated mode under the same prior observations, which is strongly correlated with the error relative to the ground truth.
  • Figure 2: NLL of the predicted state distributions as a function of the starting denoising step $\hat{s}$ at which the variance starts propagating.
  • Figure 3: U2Diff architecture. Top: Decoupled temporal and social processing in each residual block. Bottom: Multi-scene attention processing and projection with Linear+ReLU+Softmax operations in RankNN to obtain the $K$ error probabilities $e$.
  • Figure 4: Qualitative comparisons in trajectory completion (top) and forecasting (bottom). Our U2Diff is compared with UniTraj (xu2025sportstraj) for trajectory completion and LED (mao2023leapfrog) for trajectory forecasting. Ground truth player locations are shown in bright blue and pink, and the ball in green. Model input observations are in white. The predicted mode with the best minSADE$_{20}$ is shown, with players in dark blue and pink, and the ball in yellow.
  • Figure 5: Qualitative evaluation of the error correlation. Top: In orange, the AvgUcty versus SADE across the 20 generated modes of a test scene example. In blue, the error probability $e$ versus SADE. Bottom: Distribution of Spearman correlation coefficients $\rho$ for all four test datasets, using AvgUcty in orange and RankNN predicting $e$ in blue.
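
The quantities compared in Figures 4 and 5 can be sketched as follows. This assumes SADE is the displacement error averaged jointly over all agents and timesteps of a scene, computed per mode, and that minSADE$_{20}$ is its minimum over the 20 modes. The array shapes, the helper names, and the softmax stand-in for the RankNN output are hypothetical and serve only to show how a Spearman coefficient $\rho$ between per-mode scores and SADE would be computed.

```python
import numpy as np

def sade(pred, gt):
    """Scene-level average displacement error for each mode.

    pred: (K, N, T, 2) array of K modes, N agents, T timesteps.
    gt:   (N, T, 2) ground-truth positions.
    """
    return np.linalg.norm(pred - gt[None], axis=-1).mean(axis=(1, 2))

def ranks(x):
    """Ordinal ranks (0 = smallest); ties are not averaged."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(len(x))
    return r

def spearman(a, b):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return np.corrcoef(ranks(a), ranks(b))[0, 1]

# Toy scene: 20 modes, 11 agents, 30 timesteps, with a different
# noise level per mode so that the modes have distinct errors.
rng = np.random.default_rng(0)
gt = rng.normal(size=(11, 30, 2))
scales = np.linspace(0.1, 2.0, 20)[:, None, None, None]
pred = gt[None] + scales * rng.normal(size=(20, 11, 30, 2))

errors = sade(pred, gt)       # one SADE value per mode
min_sade = errors.min()       # minSADE over the 20 modes

# Stand-in for RankNN: a softmax over the true errors, which is an
# idealized score that is perfectly monotone in SADE.
e = np.exp(errors - errors.max())
e /= e.sum()
rho = spearman(e, errors)
```

A real ranking network would not see the ground truth, so its $\rho$ would fall below the ideal value this stand-in reaches; the sketch only fixes the metric definitions used for the comparison.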