IMPACT: Behavioral Intention-aware Multimodal Trajectory Prediction with Adaptive Context Trimming

Jiawei Sun; Xibin Yue; Jiahui Li; Tianle Shen; Chengran Yuan; Shuo Sun; Sheng Guo; Quanyun Zhou; Marcelo H Ang

TL;DR

IMPACT jointly predicts the behavioral intentions and future trajectories of surrounding agents in autonomous driving. It uses a unified model with a shared context encoder and two context filters that prune irrelevant agents and map polylines, guided by predicted intentions and vectorized occupancy, together with an automatic labeling approach that supplies intention labels for large datasets. The method ranks first among LiDAR-free methods on the Waymo Motion Dataset and first on the Waymo Interactive Prediction Dataset, and it has been deployed on real vehicles, where pruning reduces computational overhead. Overall, IMPACT improves the interpretability, efficiency, and accuracy of motion prediction.

Abstract

While most prior research has focused on improving the precision of multimodal trajectory predictions, the explicit modeling of multimodal behavioral intentions (e.g., yielding, overtaking) remains relatively underexplored. This paper proposes a unified framework that jointly predicts both behavioral intentions and trajectories to enhance prediction accuracy, interpretability, and efficiency. Specifically, we employ a shared context encoder for both intention and trajectory predictions, thereby reducing structural redundancy and information loss. Moreover, we address the lack of ground-truth behavioral intention labels in mainstream datasets (Waymo, Argoverse) by auto-labeling these datasets, thus advancing the community's efforts in this direction. We further introduce a vectorized occupancy prediction module that infers the probability of each map polyline being occupied by the target vehicle's future trajectory. By leveraging these intention and occupancy prediction priors, our method conducts dynamic, modality-dependent pruning of irrelevant agents and map polylines in the decoding stage, effectively reducing computational overhead and mitigating noise from non-critical elements. Our approach ranks first among LiDAR-free methods on the Waymo Motion Dataset and achieves first place on the Waymo Interactive Prediction Dataset. Remarkably, even without model ensembling, our single-model framework improves the soft mean average precision (softmAP) by 10 percent compared to the second-best method in the Waymo Interactive Prediction Leaderboard. Furthermore, the proposed framework has been successfully deployed on real vehicles, demonstrating its practical effectiveness in real-world applications.
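The modality-dependent pruning described in the abstract can be illustrated with a minimal sketch. It assumes pruning is a per-mode top-k selection over predicted occupancy probabilities; the abstract does not state the actual selection rule, so the function name `prune_polylines`, the `keep_k` parameter, and the array shapes are hypothetical and are not the authors' implementation.

```python
import numpy as np


def prune_polylines(features, occupancy_prob, keep_k):
    """Keep the keep_k most likely occupied map polylines for each mode.

    Hypothetical illustration only; top-k selection is an assumption.
    features:       (P, D) array, one feature vector per map polyline.
    occupancy_prob: (M, P) array, predicted probability that each polyline
                    is occupied by the target's future trajectory, per mode.
    Returns kept features of shape (M, k, D) and their indices of shape (M, k).
    """
    features = np.asarray(features)
    occupancy_prob = np.asarray(occupancy_prob)
    keep_k = min(keep_k, features.shape[0])
    # Sort polylines by descending probability, independently for each mode.
    order = np.argsort(-occupancy_prob, axis=1, kind="stable")
    kept_idx = order[:, :keep_k]
    # Integer-array indexing gathers a (M, k, D) tensor of surviving polylines.
    return features[kept_idx], kept_idx


# Toy example: 4 polylines with 2-dimensional features, 2 predicted modes.
toy_features = np.arange(8).reshape(4, 2)
toy_prob = np.array([[0.1, 0.9, 0.3, 0.7],
                     [0.8, 0.2, 0.6, 0.4]])
kept, idx = prune_polylines(toy_features, toy_prob, keep_k=2)
```

Under this assumption, each mode attends only to its own subset of polylines in the decoder, which is one way the reduction in decoding cost described above could arise.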
