Knowledge-aware Graph Transformer for Pedestrian Trajectory Prediction
Yu Liu, Yuexin Zhang, Kunming Li, Yongliang Qiao, Stewart Worrall, You-Fu Li, He Kong
TL;DR
This paper addresses cross-scene variability in pedestrian trajectory prediction by introducing a knowledge-aware graph transformer that models social interactions and temporal motion via spatial-temporal graphs and multi-head attention. It combines a spatial GNN and a temporal GNN with a time-extrapolator CNN, and trains with a hybrid loss that blends maximum likelihood with maximum mean discrepancy (MMD) to align distributions across datasets. On ETH/UCY, the approach achieves lower average and final displacement errors (ADE/FDE) and reduced prediction-robustness variance compared with strong baselines, indicating better generalization across scenes. The work advances practical trajectory prediction for autonomous systems by explicitly addressing domain heterogeneity and uncertainty through graph-based, attention-driven modeling.
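For readers unfamiliar with the ADE/FDE metrics mentioned above, the following is a minimal sketch of their standard definitions for a single pedestrian: ADE averages the Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE takes that distance at the last step only. The function name `displacement_errors` and the toy coordinates are illustrative assumptions, not taken from the paper.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one pedestrian.

    pred, truth: equal-length lists of (x, y) positions,
    one entry per predicted time step.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equal in length")
    # Euclidean distance between prediction and ground truth at each step.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)  # average over all time steps
    fde = dists[-1]                # error at the final time step
    return ade, fde


# Toy example (hypothetical coordinates): the prediction stays on the x-axis
# while the ground truth drifts upward by one unit per step.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

In the usual evaluation protocol these per-pedestrian values are then averaged over all pedestrians and sequences in a test scene.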
Abstract
Predicting pedestrian motion trajectories is crucial for path planning and motion control of autonomous vehicles. Accurately forecasting crowd trajectories is challenging due to the uncertain nature of human motion in different environments. For training, recent deep learning-based prediction approaches mainly use information such as trajectory history and interactions between pedestrians. This can limit prediction performance across scenarios, since the discrepancies between training datasets are not properly taken into account. To overcome this limitation, this paper proposes a graph transformer structure that improves prediction performance by capturing the differences between the various sites and scenarios contained in the datasets. In particular, a self-attention mechanism and a domain adaptation module are designed to improve the generalization ability of the model. Moreover, an additional metric that considers cross-dataset sequences is introduced for training and performance evaluation. The proposed framework is validated and compared against existing methods on the popular public datasets ETH and UCY. Experimental results demonstrate the improved performance of the proposed scheme.
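The TL;DR describes the training objective as a blend of maximum likelihood and maximum mean discrepancy (MMD). As a rough illustration of what such a blend could look like, the sketch below computes the standard biased estimate of squared MMD between two sets of feature vectors and adds it to a negative log-likelihood term. The Gaussian kernel, the `bandwidth` and `weight` parameters, the function names, and the choice of which features are compared are all assumptions made for this example; the section does not specify the authors' exact formulation.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with each expectation replaced by a mean over all sample pairs.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def hybrid_loss(nll, xs, ys, weight=1.0, bandwidth=1.0):
    """Hypothetical hybrid objective: likelihood term plus weighted MMD term."""
    return nll + weight * mmd_squared(xs, ys, bandwidth)


# Toy features standing in for two datasets (hypothetical values).
scene_a = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1)]
scene_b = [(2.0, 2.0), (2.1, 1.9), (1.9, 2.1)]

mmd_same = mmd_squared(scene_a, scene_a)  # identical samples
mmd_diff = mmd_squared(scene_a, scene_b)  # shifted samples
loss = hybrid_loss(nll=2.0, xs=scene_a, ys=scene_b, weight=0.5)
```

The MMD term is zero when the two sample sets coincide and grows as their distributions move apart, so minimizing it alongside the likelihood term pushes the model toward representations that look similar across datasets.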
