Scene-Aware Explainable Multimodal Trajectory Prediction

Pei Liu, Haipeng Liu, Xingyu Liu, Yiqun Li, Junlan Chen, Yangfan He, Jun Ma

TL;DR

The paper addresses the need for joint reasoning across interacting agents and explainability in autonomous driving trajectory prediction. It introduces DMTP, a conditional diffusion‑based, multimodal framework with a Shapley‑value–based explainability module that provides global and scene‑level insights into feature importance. Through Waymo Open Motion Dataset experiments, DMTP achieves superior predictive accuracy while revealing interpretable input influences that align with human driving intuition. The work demonstrates both improved performance and practical interpretability, and makes the code openly available for reuse.

Abstract

Advancements in intelligent technologies have significantly improved navigation in complex traffic environments by enhancing environment perception and trajectory prediction for automated vehicles. However, current research often overlooks the joint reasoning of scenario agents and lacks explainability in trajectory prediction models, limiting their practical use in real-world situations. To address this, we introduce the Explainable Conditional Diffusion-based Multimodal Trajectory Prediction (DMTP) model, which is designed to elucidate the environmental factors influencing predictions and reveal the underlying mechanisms. Our model integrates a modified conditional diffusion approach to capture multimodal trajectory patterns and employs a revised Shapley Value model to assess the significance of global and scenario-specific features. Experiments using the Waymo Open Motion Dataset demonstrate that our explainable model excels in identifying critical inputs and significantly outperforms baseline models in accuracy. Moreover, the factors identified align with the human driving experience, underscoring the model's effectiveness in learning accurate predictions. Code is available in our open-source repository: https://github.com/ocean-luna/Explainable-Prediction.

Paper Structure

This paper contains 22 sections, 6 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Architecture of DMTP.
  • Figure 2: Overview of the conditional diffusion for scenario latent space.
  • Figure 3: Global feature importance (Shapley Value). The red dot represents the mean value and the red dashed line represents the median.
  • Figure 4: The results of the trajectory prediction error under different traffic scenes caused by the traffic sign.
  • Figure 5: Qualitative results of trajectory prediction and feature importance estimation on different scenarios (corresponding to different rows). Different colors are used to denote different agents: red - the predicted agent; blue - neighboring agents. The dotted line represents the historical trajectory, and the solid line represents the future trajectory. In addition, the scene feature importance is highlighted in the heatmap. h, n, s, m represent the feature importance of historical trajectory, neighboring agents, traffic sign, and map respectively.
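The feature-importance scores in Figures 3 and 5 are Shapley values over four input groups (h, n, s, m). The paper uses a revised Shapley Value model whose details are not reproduced on this page, so the following is only a minimal sketch of the classical, exact Shapley computation. The value function `toy_value` and its numbers are hypothetical stand-ins for a real "prediction quality given a subset of inputs" score; they are not taken from the paper or its repository.

```python
from itertools import combinations
from math import factorial

# Feature groups named in the Figure 5 caption:
# h = historical trajectory, n = neighboring agents, s = traffic sign, m = map.
FEATURES = ("h", "n", "s", "m")


def shapley_values(players, value_fn):
    """Exact Shapley values: weighted average marginal contribution of each
    player over all coalitions of the remaining players."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {p}) - value_fn(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical score for a subset of inputs (illustration only):
    additive contributions plus an interaction between neighbors and map."""
    base = {"h": 0.5, "n": 0.2, "s": 0.1, "m": 0.1}
    score = sum(base[f] for f in coalition)
    if "n" in coalition and "m" in coalition:
        score += 0.1  # interaction term, shared equally by n and m
    return score


phi = shapley_values(FEATURES, toy_value)
for name in FEATURES:
    print(f"{name}: {phi[name]:.3f}")
```

In this toy setting the interaction bonus is split evenly between n and m, and the four scores sum to the value of the full input set minus the value of the empty set (the efficiency property). Exact enumeration is cheap with four groups; with many features, sampling-based approximations are the usual substitute.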

Theorems & Definitions (6)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6