RealTraj: Towards Real-World Pedestrian Trajectory Forecasting

Ryo Fujii, Hideo Saito, Ryo Hachiuma

TL;DR

RealTraj tackles real-world pedestrian trajectory forecasting under perception and annotation constraints by combining self-supervised pretraining on synthetic data with weakly-supervised fine-tuning on real detections. The Det2TrajFormer model forecasts future trajectories directly from past detections, remaining robust to tracking noise. Three self-supervised pretext tasks (unmasking, denoising, and person identity reconstruction) plus an acceleration-regularized fine-tuning objective enable strong performance with limited real data and without person-ID annotations. Across multiple datasets, RealTraj demonstrates robust performance under perception errors and competitive results with full supervision, offering a practical pathway toward real-world deployment.
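The summary above does not state the exact form of the acceleration-regularized objective. As a minimal sketch, assume the regularizer penalizes the squared second-order finite difference of a predicted 2D trajectory, which is one common way to discourage abrupt changes in velocity. The names `acceleration_penalty`, `regularized_loss`, `weight`, and `dt` are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def acceleration_penalty(traj: List[Point], dt: float = 1.0) -> float:
    """Mean squared norm of the second-order finite difference of `traj`.

    Assumption: this is a generic smoothness penalty, not necessarily
    the regularizer used by RealTraj.
    """
    if len(traj) < 3:
        return 0.0  # acceleration needs at least three positions
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(traj, traj[1:], traj[2:]):
        ax = (x2 - 2.0 * x1 + x0) / dt ** 2
        ay = (y2 - 2.0 * y1 + y0) / dt ** 2
        total += ax * ax + ay * ay
    return total / (len(traj) - 2)


def regularized_loss(forecast_error: float, traj: List[Point],
                     weight: float = 0.1, dt: float = 1.0) -> float:
    """Hypothetical combination: a forecasting error term plus the penalty."""
    return forecast_error + weight * acceleration_penalty(traj, dt)


# A constant-velocity path has zero penalty; a path that stalls and
# then jumps is penalized.
constant_velocity = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]
jittery = [(0.0, 0.0), (1.0, 0.5), (1.0, 1.5), (3.0, 1.5)]
```

Under this assumed form, the penalty is zero for any constant-velocity forecast and grows with sudden stops or jumps, so it favors physically plausible motion without requiring person-ID labels.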

Abstract

This paper jointly addresses three key limitations in conventional pedestrian trajectory forecasting: pedestrian perception errors, real-world data collection costs, and person ID annotation costs. We propose a novel framework, RealTraj, that enhances the real-world applicability of trajectory forecasting. Our approach includes two training phases -- self-supervised pretraining on synthetic data and weakly-supervised fine-tuning with limited real-world data -- to minimize data collection efforts. To improve robustness to real-world errors, we focus on both model design and training objectives. Specifically, we present Det2TrajFormer, a trajectory forecasting model that remains invariant to tracking noise by using past detections as inputs. Additionally, we pretrain the model using multiple pretext tasks, which enhance robustness and improve forecasting performance based solely on detection data. Unlike previous trajectory forecasting methods, our approach fine-tunes the model using only ground-truth detections, reducing the need for costly person ID annotations. In the experiments, we comprehensively verify the effectiveness of the proposed method against the limitations, and the method outperforms state-of-the-art trajectory forecasting methods on multiple datasets. The code will be released at https://fujiry0.github.io/RealTraj-project-page.
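To make the pretraining idea concrete, the sketch below shows one way the inputs for the unmasking and denoising pretext tasks might be built from clean synthetic detections: some detections are dropped, and the rest are perturbed with Gaussian noise. The clean frames would serve as the reconstruction target. The function name `corrupt_detections` and the parameters `mask_ratio` and `noise_std` are assumptions for illustration; the paper's actual corruption scheme is not described in this summary, and person identity reconstruction is not sketched here.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # unordered detections in one frame, no person IDs


def corrupt_detections(frames: List[Frame], mask_ratio: float = 0.3,
                       noise_std: float = 0.1, seed: int = 0) -> List[Frame]:
    """Return a corrupted copy of `frames` (hypothetical pretext-task input).

    - With probability `mask_ratio`, a detection is dropped (unmasking target).
    - Remaining detections get zero-mean Gaussian noise (denoising target).
    - Each frame is shuffled so that list order carries no identity.
    """
    rng = random.Random(seed)
    corrupted = []
    for frame in frames:
        kept = []
        for x, y in frame:
            if rng.random() < mask_ratio:
                continue  # masked detection, to be reconstructed
            kept.append((x + rng.gauss(0.0, noise_std),
                         y + rng.gauss(0.0, noise_std)))
        rng.shuffle(kept)
        corrupted.append(kept)
    return corrupted


# Two pedestrians observed over three frames, stored as per-frame sets.
clean = [[(0.0, 0.0), (5.0, 5.0)],
         [(0.5, 0.0), (5.0, 4.5)],
         [(1.0, 0.0), (5.0, 4.0)]]
noisy = corrupt_detections(clean, mask_ratio=0.3, noise_std=0.1, seed=42)
```

A training pair in this sketch would be `(noisy, clean)`: the model sees the corrupted detections and is asked to recover the clean ones.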

Paper Structure

This paper contains 26 sections, 6 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Our paper addresses the three limitations in the existing pedestrian trajectory forecasting task. (Top) Pedestrian perception errors can significantly degrade trajectory forecasting performance. (Middle) Real-world data collection necessitates substantial manual effort. (Bottom) Person ID annotations require extensive manual labor.
  • Figure 2: Our proposed framework consists of two training phases and an inference phase. (1) Self-supervised pretraining on synthetic data using multiple pretext tasks. (2) Weakly-supervised fine-tuning on real ground-truth detections. (3) Future trajectory inference based solely on detection inputs.
  • Figure 3: Examples of synthetic trajectories generated for self-supervised pretraining.
  • Figure 4: Comparison of model robustness against various error types on the JRDB dataset, including detection errors (missed detections and localization errors), tracking errors (identity switches), and their combined impact.
  • Figure 5: Effect of corruption ratio during training and number of synthetic trajectories.
  • ...and 4 more figures
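
The identity switches named in the Figure 4 caption can be illustrated with a small sketch. It assumes tracks are stored as a mapping from person ID to a list of positions, and it simulates a switch by swapping the tails of two tracks at frame `t`. The helper names `identity_switch` and `detections_per_frame` are hypothetical. The point of the example is that the per-frame detection sets are identical before and after the switch, which is consistent with the abstract's claim that a detection-based input is unaffected by tracking noise.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Tracks = Dict[int, List[Point]]  # person ID -> positions over time


def identity_switch(tracks: Tracks, id_a: int, id_b: int, t: int) -> Tracks:
    """Simulate a tracker swapping two identities from frame `t` onward."""
    a, b = tracks[id_a], tracks[id_b]
    switched = dict(tracks)
    switched[id_a] = a[:t] + b[t:]
    switched[id_b] = b[:t] + a[t:]
    return switched


def detections_per_frame(tracks: Tracks) -> List[List[Point]]:
    """Drop the IDs: return the sorted set of positions in each frame."""
    n_frames = max(len(p) for p in tracks.values())
    return [sorted(p[i] for p in tracks.values() if i < len(p))
            for i in range(n_frames)]


tracks = {1: [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
          2: [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]}
switched = identity_switch(tracks, 1, 2, t=2)
```

Here `switched[1]` ends on the other pedestrian's path, so a tracklet-based forecaster would see a sudden jump, while `detections_per_frame` returns the same result for `tracks` and `switched`.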