Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Yao Liu, Binghao Li, Xianzhi Wang, Claude Sammut, Lina Yao

TL;DR

This work proposes Attention-aware Social Graph Transformer Networks for multi-modal trajectory prediction. The model combines Graph Convolutional Networks and Transformer Networks by generating stable-resolution pseudo-images from spatio-temporal graphs through a designed stacking and interception method.

Abstract

Trajectory prediction is fundamental to various intelligent technologies, such as autonomous driving and robotics. Predicting the motion of pedestrians and vehicles supports emergency braking, reduces collisions, and improves traffic safety. Current trajectory prediction research faces the problems of complex social interactions, high dynamics, and multi-modality, and it remains particularly limited in long-term prediction. We propose Attention-aware Social Graph Transformer Networks for multi-modal trajectory prediction. We combine Graph Convolutional Networks and Transformer Networks by generating stable-resolution pseudo-images from spatio-temporal graphs through a designed stacking and interception method. Furthermore, we design an attention-aware module to handle social interaction information in scenarios involving mixed pedestrian-vehicle traffic. We thus retain the advantages of both the graph and the Transformer: the ability to aggregate information over an arbitrary number of neighbors and the ability to perform complex time-dependent data processing. We conduct experiments on datasets involving pedestrian, vehicle, and mixed trajectories. Our results demonstrate that our model minimizes displacement errors across various metrics and significantly reduces the likelihood of collisions. Notably, our model effectively reduces the final displacement error, illustrating its ability to predict over long horizons.
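
The abstract refers to displacement errors and to the final displacement error without defining them here. As a minimal sketch, assuming the conventional definitions used in trajectory prediction (average displacement error as the mean Euclidean distance over all predicted steps, final displacement error as the distance at the last step), the metrics could be computed as follows. The function names and the best-of-K selection for multi-modal samples are illustrative assumptions, not the authors' evaluation code.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Euclidean distance between prediction and ground truth at every step.
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    # ADE averages over all steps; FDE looks only at the last step.
    return sum(dists) / len(dists), dists[-1]


def best_of_k(samples, truth):
    """Minimum ADE and minimum FDE over K sampled futures.

    Assumption: a best-of-K protocol, which is common for stochastic
    predictors; the page does not state which protocol the paper uses.
    """
    errors = [displacement_errors(sample, truth) for sample in samples]
    return min(ade for ade, _ in errors), min(fde for _, fde in errors)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    drifting = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(displacement_errors(drifting, truth))
    print(best_of_k([drifting, truth], truth))
```

Because the error in the drifting example grows with each step, its final displacement error is larger than its average, which is why the final displacement error is often read as an indicator of long-horizon quality.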

Paper Structure (27 sections, 33 equations, 9 figures, 6 tables, 1 algorithm)


Figures (9)

  • Figure 1: An example of a pedestrian motion trajectory in a real-world scenario. Pedestrian trajectory prediction needs to consider social interaction information, and pedestrian trajectories are dynamic and multi-modal.
  • Figure 2: Overview of our Attention-aware Social Graph Transformer Networks. The model mainly consists of the Social Interaction Spatio-temporal Graph, the Spatio-temporal Graph Convolutional Network with pseudo-image and Attention-aware module, the Transformer Temporal Extrapolator, and multi-modal prediction.
  • Figure 3: Social Interaction Spatio-temporal Graph.
  • Figure 4: Pseudo-image. The node set (a) and adjacency matrix set (b) of the spatio-temporal graph are stacked and intercepted in a certain way to obtain constant-resolution node-set pseudo-images and adjacency-matrix pseudo-images.
  • Figure 5: Attention-aware adjacency matrix. When drivers switch roads, their attention is focused on the vehicles in front of and behind them on this side and less on the vehicles on the other side.
  • ...and 4 more figures
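
The Figure 4 caption describes turning per-frame node sets and adjacency matrices into constant-resolution pseudo-images by stacking and interception, but the exact scheme is not given on this page. The sketch below shows one plausible reading under stated assumptions: each frame is truncated ("intercepted") or zero-padded to a fixed number of agents, and the frames are then stacked along the time axis. The names `fix_nodes`, `fix_adjacency`, `to_pseudo_images`, and `max_agents` are hypothetical, and the paper's actual procedure may differ.

```python
def fix_nodes(nodes, max_agents, feat_dim=2):
    """Truncate or zero-pad one frame's node list to exactly max_agents rows."""
    kept = [list(node) for node in nodes[:max_agents]]
    kept += [[0.0] * feat_dim for _ in range(max_agents - len(kept))]
    return kept


def fix_adjacency(adj, max_agents):
    """Truncate or zero-pad one frame's adjacency matrix to max_agents x max_agents."""
    kept = []
    for row in adj[:max_agents]:
        cropped = list(row[:max_agents])
        kept.append(cropped + [0.0] * (max_agents - len(cropped)))
    kept += [[0.0] * max_agents for _ in range(max_agents - len(kept))]
    return kept


def to_pseudo_images(frames, max_agents):
    """Stack per-frame (nodes, adjacency) pairs into two fixed-size arrays.

    Assumed output shapes: T x max_agents x feat_dim for the nodes and
    T x max_agents x max_agents for the adjacency matrices.
    """
    node_img = [fix_nodes(nodes, max_agents) for nodes, _ in frames]
    adj_img = [fix_adjacency(adj, max_agents) for _, adj in frames]
    return node_img, adj_img


if __name__ == "__main__":
    frames = [
        ([(1.0, 2.0)], [[1.0]]),  # one agent in this frame
        ([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # three agents in this frame
         [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]),
    ]
    node_img, adj_img = to_pseudo_images(frames, max_agents=2)
    print(node_img)
    print(adj_img)
```

Fixing the agent dimension is what gives downstream layers an input of constant size even though the number of agents varies from frame to frame. In this simplified version, truncation discards agents beyond `max_agents`.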