ITPNet: Towards Instantaneous Trajectory Prediction for Autonomous Driving

Rongqing Li, Changsheng Li, Yuhang Li, Hanjie Li, Yi Chen, Dongchun Ren, Ye Yuan, Guoren Wang

TL;DR

This work proposes a backward forecasting mechanism that predicts, backward in time, the latent feature representations of an agent's unobserved historical trajectory from its two observed locations, then uses them as complementary information for future trajectory prediction.

Abstract

Trajectory prediction of agents is crucial for the safety of autonomous vehicles, yet previous approaches usually rely on a sufficiently long observed trajectory to predict an agent's future trajectory. In real-world scenarios, however, it is often unrealistic to collect enough observed locations for moving agents, which causes most prediction models to break down. For instance, when a moving car suddenly appears very close to an autonomous vehicle after being hidden by an obstruction, the autonomous vehicle must quickly and accurately predict the car's future trajectories from only a few observed locations. In light of this, we investigate the task of instantaneous trajectory prediction, i.e., only two observed locations are available during inference. To this end, we propose a general, plug-and-play instantaneous trajectory prediction approach called ITPNet. Specifically, we propose a backward forecasting mechanism that predicts, backward in time, the latent feature representations of the agent's unobserved historical trajectories from its two observed locations and then leverages them as complementary information for future trajectory prediction. Meanwhile, because the predicted latent feature representations inevitably contain noise and redundancy, we further devise a Noise Redundancy Reduction Former, which filters out noise and redundancy from the unobserved trajectories and integrates the filtered features and observed features into a compact query for future trajectory prediction. In essence, ITPNet is naturally compatible with existing trajectory prediction models, enabling them to gracefully handle instantaneous trajectory prediction. Extensive experiments on the Argoverse and nuScenes datasets demonstrate that ITPNet outperforms the baselines and is effective with different trajectory prediction models.
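To make the instantaneous setting concrete, the sketch below shows how little two observed locations carry: a single displacement vector. It uses plain constant-velocity extrapolation, which is not ITPNet's method (ITPNet predicts latent features of the unobserved history with a learned model rather than extrapolating coordinates). The function name `constant_velocity_rollout` and its parameters are hypothetical and chosen only for illustration.

```python
def constant_velocity_rollout(p_prev, p_curr, steps, backward=False):
    """Extrapolate `steps` locations from two observed (x, y) points.

    Illustrative baseline only (an assumption, not the paper's model).
    backward=False: continue forward in time after p_curr.
    backward=True:  extend back in time before p_prev, returned oldest first.
    """
    # The only motion cue available from two points: one displacement per step.
    vx = p_curr[0] - p_prev[0]
    vy = p_curr[1] - p_prev[1]

    if backward:
        # Step away from the older point against the direction of motion.
        points = [(p_prev[0] - k * vx, p_prev[1] - k * vy)
                  for k in range(1, steps + 1)]
        return points[::-1]  # reorder so the oldest location comes first

    # Step away from the newer point along the direction of motion.
    return [(p_curr[0] + k * vx, p_curr[1] + k * vy)
            for k in range(1, steps + 1)]


if __name__ == "__main__":
    observed = [(0.0, 0.0), (1.0, 0.5)]
    print("guessed history:", constant_velocity_rollout(*observed, steps=2, backward=True))
    print("guessed future: ", constant_velocity_rollout(*observed, steps=2))
```

A baseline like this cannot represent turns or speed changes, which is one way to see why the abstract argues for recovering richer information about the unobserved history.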

Paper Structure

This paper contains 22 sections, 16 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Results of HiVT [zhou2022hivt] in terms of minADE@6 and minFDE@6 on the validation set of Argoverse [chang_argoverse_2019] with different numbers of observed locations as inputs during training and testing. The value on the horizontal axis denotes the number of observed locations. (b) Future predictions (shown in green) when utilizing different lengths of predicted unobserved trajectory locations. The observed trajectories are shown in orange, the predicted unobserved trajectories are shown in brown, the ground-truth unobserved trajectories are shown in blue, and the ground-truth future trajectories are shown in red.
  • Figure 2: Overview of our ITPNet framework. ITPNet mainly consists of two modules: 1) We propose a backward forecasting mechanism that attempts to reconstruct the latent feature representations $\mathbf{V}^{unobs}$ of the previous unobserved trajectory locations $\mathbf{X}^{unobs}$ from the two observed trajectory locations $\mathbf{X}^{obs}$. 2) We devise a Noise Redundancy Reduction Former to filter out noise and redundancy in the predicted latent feature representations $\hat{\mathbf{V}}^{unobs}$, and both the resulting filtered features and the observation features $\mathbf{V}^{obs}$ are integrated into a compact query embedding $\mathbf{Q}$. Finally, the query embedding is sent to the decoder to instantaneously predict future trajectories $\{\hat{\mathbf{X}}^k\}$.
  • Figure 3: Structure of Noise Redundancy Reduction Block.
  • Figure 4: Qualitative results of a) HiVT, b) MOE+HiVT, c) Distill+HiVT, d) ITPNet+HiVT on Argoverse. The observed historical trajectories are shown in red, the ground-truth future trajectories are shown in black, and the predicted multi-modal future trajectories are shown in green.
  • Figure 5: Failure case of ITPNet+HiVT on Argoverse. The observed trajectories are shown in red, the ground-truth trajectories are shown in black, and the predicted multi-modal trajectories are shown in green.
  • ...and 1 more figures
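
The minADE@6 and minFDE@6 metrics named in the Figure 1 caption can be sketched as follows, assuming the common definitions: the average (ADE) or final-step (FDE) Euclidean displacement between a predicted trajectory and the ground truth, minimized over K candidate predictions (K = 6 here). The function name `min_ade_fde` is hypothetical. Benchmarks differ on whether both minima must come from the same candidate; this sketch takes each minimum independently, which is an assumption rather than the paper's stated evaluation protocol.

```python
import math


def min_ade_fde(predictions, ground_truth):
    """Return (minADE@K, minFDE@K) over K candidate trajectories.

    predictions:  list of K trajectories, each a list of (x, y) points.
    ground_truth: list of (x, y) points, same length as each candidate.
    """
    if not predictions:
        raise ValueError("need at least one predicted trajectory")

    ades, fdes = [], []
    for trajectory in predictions:
        if len(trajectory) != len(ground_truth):
            raise ValueError("prediction and ground truth lengths differ")
        # Per-timestep Euclidean distance to the ground truth.
        dists = [math.dist(p, g) for p, g in zip(trajectory, ground_truth)]
        ades.append(sum(dists) / len(dists))  # average displacement error
        fdes.append(dists[-1])                # final displacement error

    # Assumption: each minimum is taken independently over the K candidates.
    return min(ades), min(fdes)


if __name__ == "__main__":
    gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    candidates = [
        [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant 1 m lateral offset
        [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)],  # drifts only at the last step
    ]
    print(min_ade_fde(candidates, gt))
```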