Pedestrian Trajectory Prediction Based on Social Interactions Learning With Random Weights
Jiajia Xie, Sheng Zhang, Beihao Xia, Zhu Xiao, Hongbo Jiang, Siwang Zhou, Zheng Qin, Hongyang Chen
TL;DR
Pedestrian trajectory prediction in crowded scenes is hard because social interactions are implicit and are not well captured by fixed edge weights. We propose DTGAN, which extends GANs to graph sequence data and uses random edge weights to learn interactions automatically and produce multi-modal future trajectories. The Generator combines SPE, GAT with random weights, a Temporal Convolutional Network, and CNN-based decoding, while the Discriminator uses SPE-LSTM-FC for realism scoring. We explore several task losses (MSE, Gaussian NLL, Uniform likelihood) in conjunction with the WGAN objective to balance realism and diversity, and demonstrate state-of-the-art ADE/FDE and AMD/AMV on the ETH/UCY datasets, with DTGAN-G achieving the best distributional metrics. DTGAN is also robust to random weight initializations, and ablations highlight the value of graph-based attention and temporal modeling, suggesting practical impact for safer autonomous navigation in dynamic environments.
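To make the pairing of a task loss with the WGAN objective concrete, the sketch below computes two of the task losses named above (MSE and Gaussian NLL) on toy 2D trajectories and adds one of them to the standard WGAN generator term. It is an illustration under stated assumptions, not the DTGAN implementation: the function names, the `task_weight` parameter, the weighted-sum combination, and the per-axis independent Gaussian are all hypothetical choices, and a real model would compute these quantities on network outputs.

```python
import math

def mse_loss(pred, target):
    """Mean squared error over the (x, y) points of one trajectory."""
    total = sum((px - tx) ** 2 + (py - ty) ** 2
                for (px, py), (tx, ty) in zip(pred, target))
    return total / len(pred)

def gaussian_nll(mu, sigma, target):
    """Mean negative log-likelihood per time step.

    Assumption: each axis is an independent Gaussian (diagonal covariance);
    the paper's exact parameterization is not given in this section.
    """
    nll = 0.0
    for (mx, my), (sx, sy), (tx, ty) in zip(mu, sigma, target):
        for m, s, t in ((mx, sx, tx), (my, sy, ty)):
            nll += 0.5 * math.log(2 * math.pi * s * s) + (t - m) ** 2 / (2 * s * s)
    return nll / len(mu)

def generator_loss(critic_score, task_loss, task_weight=1.0):
    """WGAN generator term (negated critic score) plus a weighted task loss.

    Assumption: the two terms are combined as a weighted sum.
    """
    return -critic_score + task_weight * task_loss

# Toy example: a two-step predicted trajectory and its ground truth.
pred = [(0.0, 0.0), (1.0, 1.0)]
truth = [(0.0, 0.0), (1.0, 2.0)]
loss = generator_loss(critic_score=0.25, task_loss=mse_loss(pred, truth), task_weight=2.0)
```

Raising `task_weight` pulls generated trajectories toward the ground truth, while lowering it leaves more room for the adversarial term to shape the output.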
Abstract
Pedestrian trajectory prediction is a critical technology in the evolution of self-driving cars toward complete artificial intelligence. In recent years, modeling social interactions from pedestrians' trajectories has attracted growing interest as a route to more accurate trajectory prediction. However, existing methods for modeling pedestrian social interactions rely on pre-defined rules and struggle to capture implicit social interactions. In this work, we propose a novel framework named DTGAN, which extends Generative Adversarial Networks (GANs) to graph sequence data, with the primary objective of automatically capturing implicit social interactions and predicting pedestrian trajectories precisely. DTGAN incorporates random weights within each graph to eliminate the need for pre-defined interaction rules. We further improve DTGAN by exploring diverse task loss functions during adversarial training, which yields improvements of 16.7% and 39.3% on the ADE and FDE metrics, respectively. The effectiveness and accuracy of our framework are verified on two public datasets. The experimental results show that DTGAN achieves superior performance and captures pedestrians' intentions well.
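For readers unfamiliar with the two metrics reported above, the following minimal sketch computes them under their standard definitions: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final step. The helper names `ade_fde` and `best_of_k` are hypothetical, and taking the minimum over several sampled trajectories is a common convention for multi-modal predictors that is assumed here rather than taken from this section.

```python
import math

def ade_fde(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory of (x, y) points."""
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

def best_of_k(samples, truth):
    """Minimum ADE and minimum FDE over K sampled trajectories (assumed protocol)."""
    scores = [ade_fde(sample, truth) for sample in samples]
    return min(a for a, _ in scores), min(f for _, f in scores)

# Toy example: the prediction drifts away from a straight-line ground truth.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
```

Here the per-step errors are 0, 1, and 2, so ADE averages them and FDE takes the last one.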
