Multi-Agent Trajectory Prediction with Difficulty-Guided Feature Enhancement Network

Guipeng Xin, Duanfeng Chu, Liping Lu, Zejian Deng, Yuang Lu, Xigang Wu

TL;DR

DGFNet addresses multi-agent trajectory prediction by exploiting heterogeneous prediction difficulty across agents. It combines dual scene-centric and agent-centric representations with a Difficulty-Guided Decoder and a Future Feature Enhancement step to selectively fuse reliable future information into social interactions, achieving state-of-the-art results on Argoverse 1&2 while preserving real-time performance. Comprehensive ablations validate each module's contribution, and analyses highlight robustness on high-difficulty scenarios. The work advances practical autonomous driving forecasting by balancing accuracy with computational efficiency and suggests future research in human-like reasoning under partial perception.

Abstract

Trajectory prediction is crucial for autonomous driving as it aims to forecast the future movements of traffic participants. Traditional methods usually perform holistic inference on the trajectories of agents, neglecting the differences in prediction difficulty among agents. This paper proposes a novel Difficulty-Guided Feature Enhancement Network (DGFNet), which leverages the prediction difficulty differences among agents for multi-agent trajectory prediction. Firstly, we employ spatio-temporal feature encoding and interaction to capture rich spatio-temporal features. Secondly, a difficulty-guided decoder controls the flow of future trajectories into subsequent modules, obtaining reliable future trajectories. Then, feature interaction and fusion are performed through the future feature interaction module. Finally, the fused agent features are fed into the final predictor to generate the predicted trajectory distributions for multiple participants. Experimental results demonstrate that our DGFNet achieves state-of-the-art performance on the Argoverse 1&2 motion forecasting benchmarks. Ablation studies further validate the effectiveness of each module. Moreover, compared with SOTA methods, our method balances trajectory prediction accuracy and real-time inference speed.
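
The gating idea behind the difficulty-guided decoder can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how difficulty is scored, so the per-agent `difficulty` values, the `threshold`, and the function name `gate_future_trajectories` are all hypothetical. The sketch only shows initial future predictions being passed on for agents judged easy to predict and withheld for the rest.

```python
from typing import List, Sequence, Tuple

# A trajectory is a list of (x, y) waypoints.
Trajectory = List[Tuple[float, float]]


def gate_future_trajectories(
    futures: Sequence[Trajectory],
    difficulty: Sequence[float],
    threshold: float,
) -> Tuple[List[bool], List[Trajectory]]:
    """Keep initial future predictions only for agents judged easy to predict.

    `difficulty` and `threshold` are illustrative assumptions; the source
    does not specify how prediction difficulty is measured.
    """
    if len(futures) != len(difficulty):
        raise ValueError("one difficulty score is required per agent")
    # True marks an agent whose initial prediction is treated as reliable.
    mask = [score <= threshold for score in difficulty]
    # Only reliable futures are forwarded to later interaction and fusion.
    reliable = [traj for traj, keep in zip(futures, mask) if keep]
    return mask, reliable


# Three agents with made-up difficulty scores; the second is "hard".
futures = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (0.0, 2.0)],
    [(1.0, 1.0), (2.0, 2.0)],
]
mask, reliable = gate_future_trajectories(futures, [0.3, 2.5, 0.9], threshold=1.0)
```

In this toy run the second agent is masked out, so only the first and third trajectories would reach a downstream fusion step.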

Paper Structure (12 sections, 13 equations, 5 figures, 7 tables, 1 algorithm)

This paper contains 12 sections, 13 equations, 5 figures, 7 tables, 1 algorithm.

Figures (5)

  • Figure 1: In contrast to traditional trajectory prediction methods, we have incorporated an intermediate step to obtain reliable future trajectories. The lower part of the figure illustrates the varying prediction difficulty among different vehicles in a sample traffic scenario. The future driving trajectory of the yellow vehicle is relatively easy to predict.
  • Figure 2: For the same scenario, the left side represents the scene-centric approach, which only requires using coordinate points. On the right side, the agent-centric approach necessitates expressing through local coordinate points and pairwise-relative poses.
  • Figure 3: Our Spatio-temporal Feature Extraction includes two sets of independent encoders, which extract features based on scene representations (bottom). Subsequently, the extracted features pass through their respective Feature Interaction modules to obtain interacted actor features. Finally, we obtain the predicted trajectories and their corresponding probabilities through the trajectory decoder with Future Feature Enhancement and Difficulty-Guided Decoder.
  • Figure 4: The x-axis represents the predicted result p-minFDE, and the y-axis represents the model's parameter size. Both values should be minimized for better performance.
  • Figure 5: Quantitative results of DGFNet on the Argoverse 1&2 validation set. The top scenarios correspond to the Argoverse 1 dataset and the bottom scenarios correspond to the Argoverse 2 dataset.
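
The two representations contrasted in the Figure 2 caption can be sketched as follows. This is an illustrative reading of the caption only: the pose layout `(x, y, heading)` and the helper names `to_agent_frame` and `relative_pose` are assumptions and are not taken from the paper. Scene-centric points stay in one shared frame, while the agent-centric form re-expresses points in an agent's local frame and relates agents through pairwise-relative poses.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pose = Tuple[float, float, float]  # (x, y, heading in radians), assumed layout


def to_agent_frame(points: Sequence[Point], pose: Pose) -> List[Point]:
    """Express scene-frame points in the local frame of an agent at `pose`."""
    x0, y0, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    local = []
    for x, y in points:
        dx, dy = x - x0, y - y0
        # Translate to the agent's position, then rotate by -theta so the
        # agent's heading becomes the local +x axis.
        local.append((c * dx + s * dy, -s * dx + c * dy))
    return local


def relative_pose(src: Pose, dst: Pose) -> Pose:
    """Pose of `dst` as seen from `src` (a pairwise-relative pose)."""
    (lx, ly), = to_agent_frame([(dst[0], dst[1])], src)
    d = dst[2] - src[2]
    # Wrap the heading difference into [-pi, pi].
    dtheta = math.atan2(math.sin(d), math.cos(d))
    return lx, ly, dtheta


# An agent at (1, 1) facing +y sees the scene point (1, 3) two metres ahead.
ahead = to_agent_frame([(1.0, 3.0)], (1.0, 1.0, math.pi / 2))[0]
rel = relative_pose((0.0, 0.0, 0.0), (1.0, 0.0, math.pi / 2))
```

The local coordinates are unchanged when the whole scene is shifted or rotated, which is the usual motivation for agent-centric inputs. The cost is that relations between agents must then be carried explicitly, here by `relative_pose`.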