Unified Multimodal Vessel Trajectory Prediction with Explainable Navigation Intention
Rui Zhang, Chao Li, Kezhong Liu, Chen Wang, Bolong Zheng, Hongbo Jiang
TL;DR
DI-MTP tackles the problem of reliable, explainable short-term vessel trajectory prediction in diverse encounter scenarios by decomposing multimodality into sustained and transient intentions. The method combines a sustained-intention tree, built from historical data, with a global transient intention optimization that uses a destination-aware Conditional Variational Autoencoder (CVAE) and non-local attention to keep predictions coherent across the whole scenario. It achieves significant improvements in average displacement error (ADE) and final displacement error (FDE) over both deterministic and multimodal baselines on real AIS datasets, and offers explicit modal explanations via attention-derived prototypes. The approach enhances safety and interpretability in maritime navigation, with practical potential for real-time integration in intelligent maritime systems, while leaving room for future improvements in interpretability and regulatory alignment.
Abstract
Vessel trajectory prediction is fundamental to intelligent maritime systems. Within this domain, the need to anticipate rapid behavioral changes over short horizons in complex maritime environments has made multimodal trajectory prediction (MTP) a promising research area. However, existing vessel MTP methods suffer from limited scenario applicability and insufficient explainability. To address these challenges, we propose a unified MTP framework incorporating explainable navigation intentions, which we classify into sustained and transient categories. Our method constructs sustained intention trees from historical trajectories and models dynamic transient intentions using a Conditional Variational Autoencoder (CVAE), while using a non-local attention mechanism to maintain global scenario consistency. Experiments on real Automatic Identification System (AIS) datasets demonstrate our method's broad applicability across diverse scenarios, achieving significant improvements in both average displacement error (ADE) and final displacement error (FDE). Furthermore, our method improves explainability by explicitly revealing the navigational intentions underlying each predicted trajectory.
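
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how ADE and FDE are commonly computed, together with a best-of-K variant often used to score multimodal predictors. It is illustrative only: the function names, the use of planar (x, y) coordinates, and the best-of-K protocol are assumptions made here, not details taken from this section. AIS positions are reported as latitude and longitude, so a real evaluation would first project them or use a geodesic distance.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length sequences of (x, y) positions in the same
    planar units (an assumption of this sketch).
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each step.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over the whole horizon
    fde = dists[-1]                # error at the final time step only
    return ade, fde


def best_of_k(modes, truth):
    """Return (minADE, minFDE) over K candidate trajectories.

    Each minimum is taken independently, so the two values may come
    from different candidates.
    """
    errors = [displacement_errors(mode, truth) for mode in modes]
    return min(e[0] for e in errors), min(e[1] for e in errors)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    mode_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # drifts away steadily
    mode_b = [(0.0, 3.0), (1.0, 0.0), (2.0, 0.0)]  # wrong start, right end
    print(displacement_errors(mode_a, truth))
    print(displacement_errors(mode_b, truth))
    print(best_of_k([mode_a, mode_b], truth))
```

In this toy case both candidates have the same ADE but very different FDE, which is why the two metrics are usually reported together.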
