Uncertainty-Aware Pedestrian Trajectory Prediction via Distributional Diffusion

Yao Liu, Zesheng Ye, Rui Wang, Binghao Li, Quan Z. Sheng, Lina Yao

TL;DR

This work tackles the inherent uncertainty and multi-modality in pedestrian trajectory prediction by introducing UPDD, a distributional diffusion framework that explicitly models predictive uncertainty via bivariate Gaussian statistics over future locations. By mapping future coordinates to sufficient statistics and applying guided diffusion conditioned on historical and social cues, UPDD separates self-uncertainty from multi-agent dynamics and enables efficient multi-modal sampling through a non-Markovian forward process and an accelerated deterministic reverse process. The approach yields state-of-the-art or competitive results on the ETH/UCY and SDD benchmarks, with substantial gains in sampling efficiency from explicit density modeling and a shortened diffusion chain. This supports real-time, uncertainty-aware planning in autonomous systems and points toward broader diffusion-based sequence forecasting with explicit predictive densities.

Abstract

Tremendous effort has been devoted to predicting pedestrian trajectories with generative models that accommodate the uncertainty and multi-modality of human behavior. An individual's inherent uncertainty, e.g., a change of destination, can be masked by complex patterns resulting from the movements of interacting pedestrians. However, latent variable-based generative models often entangle such uncertainty with complexity, which limits either latent expressivity or predictive diversity. In this work, we propose to model these two factors separately: we implicitly derive a flexible latent representation to capture intricate pedestrian movements, while expressing the predictive uncertainty of individuals with explicit bivariate Gaussian mixture densities over their future locations. More specifically, we present a model-agnostic, uncertainty-aware pedestrian trajectory prediction framework that parameterizes sufficient statistics for the mixture of Gaussians that jointly comprise the multi-modal trajectories. We further estimate these parameters of interest by approximating a denoising process that progressively recovers pedestrian movements from noise. Unlike previous studies, we translate the predictive stochasticity into explicit distributions, allowing the framework to readily generate plausible future trajectories that indicate individuals' self-uncertainty. Moreover, our framework is compatible with different neural network architectures. We empirically show performance gains over the state of the art, even with lighter backbones, across most scenes on two public benchmarks.
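
As a rough illustration of what "explicit bivariate Gaussian densities over future locations" makes possible, the sketch below draws trajectories directly from per-step statistics. It is not the authors' implementation: it assumes a single bivariate Gaussian per future step (the abstract describes mixtures), independence across time steps, and hypothetical names (`sample_bivariate_gaussian`, `sample_trajectories`, and the five-tuple layout `(mu_x, mu_y, sigma_x, sigma_y, rho)`).

```python
import math
import random


def sample_bivariate_gaussian(mu_x, mu_y, sigma_x, sigma_y, rho, rng):
    """Draw one (x, y) location from a bivariate Gaussian given its five statistics."""
    # Two independent standard normal draws.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    # Cholesky-style transform that induces correlation rho between x and y.
    x = mu_x + sigma_x * z1
    y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


def sample_trajectories(stats, num_samples, seed=0):
    """Sample trajectories from per-step statistics.

    stats: list of (mu_x, mu_y, sigma_x, sigma_y, rho), one tuple per future step
           (hypothetical layout; steps are treated as independent here).
    Returns a list of num_samples trajectories, each a list of (x, y) points.
    """
    rng = random.Random(seed)
    return [
        [sample_bivariate_gaussian(*step, rng) for step in stats]
        for _ in range(num_samples)
    ]


if __name__ == "__main__":
    # Illustrative statistics for three future steps (made-up numbers).
    stats = [
        (1.0, 0.5, 0.10, 0.10, 0.0),
        (2.0, 1.0, 0.20, 0.15, 0.3),
        (3.0, 1.5, 0.30, 0.20, 0.5),
    ]
    for trajectory in sample_trajectories(stats, num_samples=3):
        print([(round(x, 2), round(y, 2)) for x, y in trajectory])
```

Because the statistics are predicted once and sampling is a closed-form transform of Gaussian noise, drawing many candidate futures under this sketch costs no additional network passes.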

Paper Structure (34 sections, 31 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Illustrative comparison between VDMs and UPDD. While VDMs rely on random noise throughout diffusion to introduce stochasticity, UPDD inherently includes predictive uncertainty by modeling distributions over trajectories.
  • Figure 2: The overview of UPDD, composed of guidance extractor, distribution converter and distributional diffusion. To account for multi-modal human movements, UPDD approximates the distributions of future trajectories under a diffusion model-based generative framework. We encode historic and neighboring effects for each pedestrian to guide conditional diffusion; we map future coordinates into sufficient statistics of a parametric density estimation, enabling fast sampling of future trajectories therefrom; we also speed up the generation with a non-Markovian "diffusion" chain by skipping certain steps.
  • Figure 3: Illustrations of Left: the history encoder $\phi(\cdot)$ (as well as part of the neighbor encoder $\psi(\cdot)$), implemented with parallel single-layer CNNs and self-attention. Right: how the guidance information $\mathbf{X}$ conditions the reverse diffusion at each step.
  • Figure 4: Qualitative Visualization
  • Figure 5: Visualizations of the predictive distribution of UPDD (100/200).
  • ...and 2 more figures
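
The Figure 2 caption mentions speeding up generation with a non-Markovian chain that skips certain steps. A minimal sketch of that idea, assuming a DDIM-style deterministic update, a linear noise schedule, and a scalar state, is given below. The paper's exact schedule and update rule are not stated in this summary, so every name and default here (`make_alpha_bar`, `skipped_schedule`, `deterministic_reverse`, `beta_start`, `beta_end`, `stride`) is hypothetical.

```python
import math


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def skipped_schedule(num_steps, stride):
    """Descending sub-sequence of timesteps, visiting every `stride`-th step."""
    return list(range(num_steps - 1, -1, -stride))


def deterministic_reverse(x_T, eps_model, alpha_bar, steps):
    """DDIM-style reverse pass with no noise injected, over the given sub-sequence.

    eps_model(x, t) stands in for a trained noise-prediction network.
    """
    x = x_T
    for i, t in enumerate(steps):
        eps = eps_model(x, t)
        a_t = alpha_bar[t]
        # Estimate the clean sample implied by the current noise prediction.
        x0_hat = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
        # Jump straight to the next retained step; the last jump lands on x0_hat.
        a_prev = alpha_bar[steps[i + 1]] if i + 1 < len(steps) else 1.0
        x = math.sqrt(a_prev) * x0_hat + math.sqrt(1.0 - a_prev) * eps
    return x


if __name__ == "__main__":
    alpha_bar = make_alpha_bar(100)
    steps = skipped_schedule(100, stride=10)  # 10 network calls instead of 100
    true_x0 = 0.7

    def oracle(x, t):
        # Toy stand-in for a trained network: the exact noise for a known x0.
        return (x - math.sqrt(alpha_bar[t]) * true_x0) / math.sqrt(1.0 - alpha_bar[t])

    x_T = math.sqrt(alpha_bar[-1]) * true_x0 + math.sqrt(1.0 - alpha_bar[-1]) * 1.3
    print(deterministic_reverse(x_T, oracle, alpha_bar, steps))
```

With a perfect noise predictor, the skipped chain recovers the clean value up to floating-point error, which is why the number of network evaluations can be reduced in this sketch without changing the update rule.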