Trajectory Prediction for Autonomous Driving using Agent-Interaction Graph Embedding

Jilan Samiuddin, Benoit Boulet, Di Wu

TL;DR

AiGem introduces a heterogeneous agent-interaction graph to forecast surrounding vehicles' trajectories around an ego vehicle. A depthwise graph encoder processes a sequence of spatial graphs linked by temporal edges, followed by a sequential GRU decoder and an MLP output to predict future positions. On NGSIM data, AiGem delivers competitive short-horizon accuracy and superior long-horizon performance while maintaining a lightweight parameter count. The approach demonstrates the value of integrating spatial interactions and temporal continuity in a unified graph representation for real-time autonomous driving decision making.

Abstract

The trajectory prediction module of an autonomous driving system is crucial for the decision-making and safety of the autonomous agent car and its surroundings. This work presents a novel scheme called AiGem (Agent-Interaction Graph Embedding) to predict the trajectories of traffic vehicles around the autonomous car. AiGem tackles this problem in four steps. First, AiGem formulates the historical traffic interaction with the autonomous agent as a graph in two stages: (1) at each time step of the history frames, agent interactions are captured using spatial edges between the agents (nodes of the graph), and (2) the resulting spatial graphs are connected in chronological order using temporal edges. Second, AiGem applies a depthwise graph encoder network to the spatial-temporal graph to generate the graph embedding, i.e., the embeddings of all nodes in the graph. Third, a sequential Gated Recurrent Unit (GRU) decoder network uses the embedding of the current timestamp to obtain the decoded states. Finally, an output network comprising a Multilayer Perceptron (MLP) predicts the trajectories from the decoded states. Results show that AiGem outperforms state-of-the-art deep learning algorithms for longer prediction horizons.
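The first step of the abstract (spatial edges per history frame, then temporal edges across frames) can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `build_interaction_graph`, the `sensing_radius` parameter, the toy coordinates, and the choice to link each agent only to its own node in the next frame are assumptions made for illustration; the abstract does not specify how temporal edges are wired.

```python
from math import hypot


def build_interaction_graph(frames, ego_id, sensing_radius):
    """Build a toy spatial-temporal graph.

    frames: list of dicts {agent_id: (x, y)}, oldest frame first.
    Nodes are (time_step, agent_id) pairs.
    Assumption: only agents within `sensing_radius` of the ego are included.
    """
    nodes, spatial_edges, temporal_edges = [], [], []
    for t, frame in enumerate(frames):
        ego_x, ego_y = frame[ego_id]
        nodes.append((t, ego_id))
        for agent_id, (x, y) in frame.items():
            if agent_id == ego_id:
                continue
            if hypot(x - ego_x, y - ego_y) <= sensing_radius:
                nodes.append((t, agent_id))
                # Bidirectional spatial edge between the ego and a sensed actor.
                spatial_edges.append(((t, ego_id), (t, agent_id)))
                spatial_edges.append(((t, agent_id), (t, ego_id)))
    node_set = set(nodes)
    for t, agent_id in nodes:
        # Assumption: a temporal edge links the same agent in consecutive frames.
        if (t + 1, agent_id) in node_set:
            temporal_edges.append(((t, agent_id), (t + 1, agent_id)))
    return nodes, spatial_edges, temporal_edges


# Hypothetical three-frame history (K_H = 3) with one ego and two actors.
frames = [
    {"ego": (0.0, 0.0), "a1": (5.0, 0.0), "a2": (50.0, 0.0)},
    {"ego": (1.0, 0.0), "a1": (6.0, 0.0), "a2": (20.0, 0.0)},
    {"ego": (2.0, 0.0), "a1": (7.0, 0.0), "a2": (10.0, 0.0)},
]
nodes, spatial_edges, temporal_edges = build_interaction_graph(frames, "ego", 30.0)
```

In this toy history, actor `a2` only enters the sensing radius in the second frame, so it has no node in the first frame and receives a single temporal edge.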

Paper Structure

This paper contains 19 sections, 19 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: GRU architecture
  • Figure 2: Proposed network architecture AiGem with $K_{H}=3$ (number of steps in the history including the present) and $K_{F}=3$ (number of steps to predict). Modules (GRUs and MLPs) inside the networks with the same color share the same weights. The number within parentheses represents the output dimension of the module.
  • Figure 3: (a) An example scenario of an ego (red) surrounded by actors (blue), of which actors 1, 2, and 3 are in its sensing area (gray circle), and (b) the graph formulation, with the ego connected to the sensed actors via bidirectional spatial edges.
  • Figure 4: Graph encoder network with $L$ layers. GAT modules with the same color share the same weights, and linear modules with the same color share the same weights.
  • Figure 5: Comparison of performance of AiGem for different values of $d_{\mathrm{min}}$
  • ...and 3 more figures
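
The captions of Figures 1 and 2 mention a GRU and modules that share weights across the $K_{F}$ prediction steps. The sketch below shows one way such a rollout could look: a single GRU cell and a single MLP reused at every step. It is an illustration only, not the paper's architecture. The dimensions, the random weights, the zero initial hidden state, the omission of bias terms, the ReLU hidden layer, and feeding the same embedding at every step are all assumptions, and the update rule follows one common GRU convention.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def gru_cell(x, h, p):
    """One GRU step (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], rh))]
    # Convention used here: h' = (1 - z) * n + z * h.
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def decode(embedding, k_f, gru, mlp):
    """Roll the same GRU cell and MLP for k_f steps and return (x, y) pairs."""
    h = [0.0] * len(gru["Uz"])  # assumption: zero initial hidden state
    trajectory = []
    for _ in range(k_f):
        h = gru_cell(embedding, h, gru)  # same weights at every step
        hidden = [max(0.0, a) for a in matvec(mlp["W1"], h)]  # ReLU layer
        trajectory.append(tuple(matvec(mlp["W2"], hidden)))
    return trajectory


rng = random.Random(0)
embed_dim, hidden_dim, mlp_dim = 4, 6, 5  # hypothetical sizes
gru = {name: rand_matrix(hidden_dim, embed_dim, rng) for name in ("Wz", "Wr", "Wn")}
gru.update({name: rand_matrix(hidden_dim, hidden_dim, rng) for name in ("Uz", "Ur", "Un")})
mlp = {"W1": rand_matrix(mlp_dim, hidden_dim, rng), "W2": rand_matrix(2, mlp_dim, rng)}

embedding = [0.5, -0.2, 0.1, 0.3]  # stand-in for one node's embedding
trajectory = decode(embedding, 3, gru, mlp)  # K_F = 3 predicted positions
```

Because the parameter dictionaries are created once and passed into every step, the rollout reuses the same weights throughout, which is the property the Figure 2 caption describes for same-colored modules.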