AGMA: Adaptive Gaussian Mixture Anchors for Prior-Guided Multimodal Human Trajectory Forecasting
Chao Li, Rui Zhang, Siyuan Huang, Xian Zhong, Hongbo Jiang
TL;DR
AGMA addresses the core bottleneck in multimodal human trajectory forecasting: misaligned priors. By first extracting batch-specific priors through graph-based clustering and then distilling them into a scene-adaptive global GMM via optimal transport and cross-attention, AGMA explicitly optimizes prior quality rather than relying on fixed or implicitly learned priors. Theoretical analysis links prior-sampler interactions to distribution matching accuracy and demonstrates that high-quality priors are necessary for faithful multimodal predictions. Empirically, AGMA achieves state-of-the-art results on ETH-UCY, SDD, and JRDB, validating the practical impact of explicit prior optimization for autonomous navigation and related AI systems.
Abstract
Human trajectory forecasting requires capturing the multimodal nature of pedestrian behavior. However, existing approaches suffer from prior misalignment: their learned or fixed priors often fail to capture the full distribution of plausible futures, limiting both prediction accuracy and diversity. We theoretically establish that prediction error is lower-bounded by prior quality, making prior modeling a key performance bottleneck. Guided by this insight, we propose AGMA (Adaptive Gaussian Mixture Anchors), which constructs expressive priors in two stages: extracting diverse behavioral patterns from training data, then distilling them into a scene-adaptive global prior for inference. Extensive experiments on the ETH-UCY, Stanford Drone Dataset (SDD), and JRDB benchmarks demonstrate that AGMA achieves state-of-the-art performance, confirming the critical role of high-quality priors in trajectory forecasting.
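To make the notion of a Gaussian mixture prior concrete, the following is a minimal sketch of drawing anchor points from a mixture with diagonal covariances. It is not AGMA's procedure: the function name `sample_gmm_anchors`, the 2D "endpoint" interpretation, and all numeric values are assumptions chosen for illustration, and the learned, scene-adaptive parameters described above are not reproduced here.

```python
import random


def sample_gmm_anchors(weights, means, stds, num_samples, seed=0):
    """Draw points from a Gaussian mixture with diagonal covariances.

    weights: K non-negative mixture weights (relative; need not sum to 1)
    means:   K mean vectors, each of length D
    stds:    K per-dimension standard deviation vectors, each of length D
    Returns (samples, components), where components[i] is the index of the
    mixture component that generated samples[i].
    """
    rng = random.Random(seed)
    # Step 1: pick a mixture component for every sample, proportional to weight.
    components = rng.choices(range(len(weights)), weights=weights, k=num_samples)
    # Step 2: draw each coordinate from that component's Gaussian.
    samples = [
        [rng.gauss(mu, sigma) for mu, sigma in zip(means[k], stds[k])]
        for k in components
    ]
    return samples, components


# Hypothetical example: three behavioral modes in a 2D endpoint space
# (roughly "go straight", "veer left", "veer right"). Values are made up.
weights = [0.5, 0.3, 0.2]
means = [[0.0, 4.0], [-2.0, 3.0], [2.0, 3.0]]
stds = [[0.3, 0.5], [0.4, 0.4], [0.4, 0.4]]
samples, components = sample_gmm_anchors(weights, means, stds, num_samples=20)
```

In this toy setting, a prior whose components miss one of the plausible modes can never propose samples near it, which is the intuition behind treating prior quality as a bottleneck.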
