PMM-Net: Single-stage Multi-agent Trajectory Prediction with Patching-based Embedding and Explicit Modal Modulation

Huajian Liu, Wei Dong, Kunpeng Fan, Chao Wang, Yongzhuo Gao

TL;DR

This letter proposes a patching-based temporal feature extraction module and a graph-based social feature extraction module, enabling effective feature extraction and cross-scenario generalization in a multi-agent trajectory prediction framework.

Abstract

Analyzing and forecasting the trajectories of agents such as pedestrians plays a pivotal role in embodied intelligent applications. The inherent indeterminacy of human behavior and the complex social interactions among a rich variety of agents make this task more challenging than common time-series forecasting. In this letter, we explore a distinct formulation for a multi-agent trajectory prediction framework. Specifically, we propose a patching-based temporal feature extraction module and a graph-based social feature extraction module, enabling effective feature extraction and cross-scenario generalization. Moreover, we reassess the role of social interaction and present a novel method based on explicit modality modulation to integrate temporal and social features, thereby constructing an efficient single-stage inference pipeline. Results on public benchmark datasets demonstrate the superior performance of our model compared with state-of-the-art methods. The code is available at: github.com/TIB-K330/pmm-net.
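To make the idea of patching concrete, the sketch below shows one common way to split an observed trajectory into overlapping temporal patches before embedding. It is only an illustration of the general technique, not the authors' implementation: the function name `make_patches`, the patch length of 4, the stride of 2, and the 8-point observed track are all assumptions chosen for the example.

```python
import numpy as np

def make_patches(traj, patch_len=4, stride=2):
    """Split a (T, 2) trajectory into overlapping, flattened patches.

    patch_len and stride are hypothetical values for illustration only.
    Returns an array of shape (num_patches, patch_len * 2).
    """
    traj = np.asarray(traj, dtype=float)
    num_steps = traj.shape[0]
    if num_steps < patch_len:
        raise ValueError("trajectory is shorter than the patch length")
    starts = range(0, num_steps - patch_len + 1, stride)
    # Each patch flattens patch_len consecutive (x, y) points into one vector.
    return np.stack([traj[s:s + patch_len].reshape(-1) for s in starts])

# Hypothetical observed track: 8 points moving along the x-axis.
observed = np.stack([np.arange(8) * 0.5, np.zeros(8)], axis=1)
patches = make_patches(observed)
print(patches.shape)
```

In a full model, each flattened patch would typically be passed through a learned projection to produce one token; that step is omitted here.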

Paper Structure

This paper contains 16 sections, 11 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Illustration of the patching-based temporal feature extraction module, where the input consists of the observed historical trajectories, and the output is the encoded tokens.
  • Figure 2: Illustration of the graph-based social feature extraction module: a) Graph representation of a specific crowd scenario, where $\mathcal{V}$ denotes the set of all nodes and $\mathcal{E}$ denotes the set of all edges; b) GNN-based social information aggregation.
  • Figure 3: Illustration of the single-stage multi-modal prediction framework.
  • Figure 4: Qualitative results on the SDD dataset, with predicted trajectories plotted in yellow, ground truth in green, and observed historical trajectories in blue. Each sub-figure is labeled with the name of its corresponding scene. The sampling interval between any two consecutive points on the same trajectory is 0.4 seconds, so the spacing between points reflects variations in movement speed.
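
The relationship between point spacing and speed noted in the Figure 4 caption can be sketched as follows. With a fixed 0.4-second sampling interval, the distance between consecutive points divided by 0.4 gives the average speed over that step. The function name `step_speeds` and the sample track are hypothetical, and speeds come out in coordinate units per second, whatever unit the trajectory coordinates use.

```python
import numpy as np

DT = 0.4  # seconds between consecutive points, as stated in the Figure 4 caption

def step_speeds(points, dt=DT):
    """Return the average speed over each step of a (T, 2) trajectory."""
    points = np.asarray(points, dtype=float)
    displacements = np.diff(points, axis=0)  # shape (T - 1, 2)
    return np.linalg.norm(displacements, axis=1) / dt

# Hypothetical track: the second step covers twice the distance of the first.
track = [(0.0, 0.0), (0.4, 0.0), (1.2, 0.0)]
speeds = step_speeds(track)
print(speeds)
```

Wider gaps between plotted points therefore correspond to faster movement, and tightly clustered points to slow movement or standing still.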