ART: Adaptive Relational Transformer for Pedestrian Trajectory Prediction with Temporal-Aware Relations

Ruochen Li, Ziyi Chang, Junyan Hu, Jiannan Li, Amir Atapour-Abarghouei, Hubert P. H. Shum

Abstract

Accurate prediction of real-world pedestrian trajectories is crucial for a wide range of robot-related applications. Recent approaches typically adopt graph-based or transformer-based frameworks to model interactions. Despite their effectiveness, these methods either introduce unnecessary computational overhead or struggle to represent the diverse and time-varying characteristics of human interactions. In this work, we present an Adaptive Relational Transformer (ART), which introduces a Temporal-Aware Relation Graph (TARG) to explicitly capture the evolution of pairwise interactions and an Adaptive Interaction Pruning (AIP) mechanism to reduce redundant computations efficiently. Extensive evaluations on ETH/UCY and NBA benchmarks show that ART delivers state-of-the-art accuracy with high computational efficiency.

Paper Structure

This paper contains 17 sections, 15 equations, 6 figures, 4 tables.

Figures (6)

  • Figure A1: Framework overview. The relation between pedestrians is inferred from the temporal evolution of their pairwise interactions over the observed history.
  • Figure A2: Overview of ART. Left: Temporal-Aware Relation Graph (TARG) leverages pairwise attention to model agent interactions across time steps, assigning higher weights to informative moments. Right: Adaptive Interaction Pruning (AIP) uses top-p filtering to adaptively retain informative neighbors based on cumulative interaction strength, producing a sparsified graph for trajectory prediction.
  • Figure C1: Ablation study of Top-$p$ threshold on the ETH/UCY dataset.
  • Figure C2: Qualitative comparisons with MART [lee2024mart] on the ETH/UCY dataset. Past trajectories are shown in blue, ground truth in red, and model predictions in green.
  • Figure C3: Qualitative comparisons with MART [lee2024mart] on the NBA dataset. Past trajectories are shown in blue, ground truth in red, and model predictions in green.
  • ...and 1 more figures
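
The top-p filtering named in the Figure A2 caption can be illustrated with a minimal sketch. It assumes that each agent has one non-negative interaction weight per neighbor and that AIP keeps the smallest set of neighbors whose cumulative share of the total weight reaches p. The function name `top_p_neighbors`, the list-based input, and the tie-breaking order are hypothetical choices for illustration, not the authors' implementation.

```python
def top_p_neighbors(weights, p=0.9):
    """Return indices of the smallest set of neighbors whose interaction
    weights cover at least a fraction p of the total weight.

    Hypothetical helper: `weights` holds one non-negative interaction
    strength per neighbor of a single agent.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    total = sum(weights)
    if total <= 0:
        return []  # no interaction mass, so nothing is retained
    # Visit neighbors from strongest to weakest interaction.
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    kept, cumulative = [], 0
    for j in order:
        kept.append(j)
        cumulative += weights[j]
        if cumulative >= p * total:
            break  # the retained neighbors now cover the requested share
    return sorted(kept)


# A peaked weight profile keeps few neighbors; a flat one keeps more.
peaked = top_p_neighbors([5, 3, 1, 1], p=0.75)
flat = top_p_neighbors([1, 1, 1, 1], p=0.75)
```

Because the cutoff depends on how the weights are distributed rather than on a fixed neighbor count, the number of retained neighbors varies per agent, which is the sense in which such filtering is adaptive.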