Towards Predicting Any Human Trajectory In Context
Ryo Fujii, Hideo Saito, Ryo Hachiuma
TL;DR
TrajICL tackles the challenge of adapting pedestrian trajectory predictors to diverse real-world environments without on-device fine-tuning. It introduces spatio-temporal similarity-based example selection (STES) to choose in-context demonstrations with similar motion at corresponding locations, and prediction-guided example selection (PG-ES) to refine that choice using predicted futures. The model is trained on the large-scale synthetic MOTSynth dataset to improve generalization. The framework is implemented on a Transformer-based predictor with RCPE and SRPE, and uses a two-stage training scheme (VTP followed by in-context training) with a min-over-K loss. Empirical results show strong in-domain and cross-domain performance, often surpassing fine-tuned baselines, while remaining suitable for edge devices. Inference cost and the quality of the example pool remain areas for future improvement.
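As a rough illustration of the min-over-K loss mentioned above, the sketch below scores K candidate futures against the ground truth and keeps only the best one. The function name, the array shapes, and the use of mean Euclidean displacement as the per-candidate error are assumptions for illustration; the summary does not specify the exact error term used in TrajICL.

```python
import numpy as np

def min_over_k_loss(preds, target):
    """Hypothetical min-over-K loss.

    preds:  (K, T, 2) array of K candidate future trajectories.
    target: (T, 2) array holding the ground-truth future.
    """
    # Mean Euclidean displacement of each candidate over the T future steps
    # (the choice of error term is an assumption, not taken from the paper).
    per_candidate = np.linalg.norm(preds - target[None], axis=-1).mean(axis=-1)
    # Only the closest candidate contributes to the loss.
    return per_candidate.min()

# Toy usage: the second candidate matches the ground truth exactly.
target = np.array([[1.0, 0.0], [2.0, 0.0]])
preds = np.stack([target + 1.0, target, target - 2.0])
loss = min_over_k_loss(preds, target)
```

Because only the best candidate is penalized, the remaining K-1 outputs are free to cover other plausible futures, which is the usual motivation for this kind of loss in multi-modal prediction.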
Abstract
Accurately predicting the future trajectories of pedestrians is essential for autonomous systems, but it remains challenging because predictors must adapt to different environments and domains. A common approach is to collect scenario-specific data and fine-tune the model via backpropagation. However, fine-tuning for each new scenario is often impractical for deployment on edge devices. To address this challenge, we introduce TrajICL, an In-Context Learning (ICL) framework for pedestrian trajectory prediction that adapts to a new scenario at inference time without fine-tuning on scenario-specific data and without any weight updates. We propose a spatio-temporal similarity-based example selection (STES) method that selects relevant examples from previously observed trajectories within the same scene by identifying similar motion patterns at corresponding locations. To further refine this selection, we introduce prediction-guided example selection (PG-ES), which selects examples based on both the past trajectory and the predicted future trajectory, rather than relying solely on the past trajectory. This approach allows the model to account for long-term dynamics when selecting examples. Finally, instead of relying on small real-world datasets with limited scenario diversity, we train our model on a large-scale synthetic dataset to strengthen its ability to make predictions from in-context examples. Extensive experiments demonstrate that TrajICL achieves remarkable adaptation across both in-domain and cross-domain scenarios, outperforming even fine-tuned approaches across multiple public benchmarks. Project Page: https://fujiry0.github.io/TrajICL-project-page/.
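The two selection ideas described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance function (mean position error plus a weighted mean velocity error), the weight `w_vel`, and all function names are hypothetical, since the abstract does not define the similarity measure. The first function compares past trajectories only, in the spirit of STES; the second appends a predicted future to the query before comparing, in the spirit of PG-ES.

```python
import numpy as np

def trajectory_distance(a, b, w_vel=1.0):
    # a, b: (T, 2) trajectories in the same scene coordinates.
    # Position term: are the two pedestrians at corresponding locations?
    pos = np.linalg.norm(a - b, axis=-1).mean()
    # Velocity term: do they move in a similar way between steps?
    vel = np.linalg.norm(np.diff(a, axis=0) - np.diff(b, axis=0), axis=-1).mean()
    return pos + w_vel * vel

def select_examples(query, pool, k=2):
    # Indices of the k pool trajectories closest to the query.
    dists = np.array([trajectory_distance(query, p) for p in pool])
    return np.argsort(dists, kind="stable")[:k]

def prediction_guided_select(query_past, predicted_future, pool_past, pool_future, k=2):
    # Compare full trajectories: the query's past plus its predicted future
    # against each pool entry's past plus its observed future.
    query_full = np.concatenate([query_past, predicted_future], axis=0)
    pool_full = [np.concatenate([p, f], axis=0) for p, f in zip(pool_past, pool_future)]
    return select_examples(query_full, pool_full, k)

# Toy pool: entries 0 and 1 share the same past but diverge afterwards.
t = np.arange(4, dtype=float)
past = np.stack([t, np.zeros(4)], axis=1)            # walking along +x
straight = np.stack([4 + t, np.zeros(4)], axis=1)    # keeps going along +x
turn = np.stack([np.full(4, 3.0), 1 + t], axis=1)    # turns towards +y
pool_past = [past, past.copy(), past + 10.0]
pool_future = [straight, turn, straight + 10.0]

query_past = past + 0.1
predicted_future = turn + 0.1   # stand-in for a first-pass model prediction

past_only = select_examples(query_past, pool_past, k=2)
guided = prediction_guided_select(query_past, predicted_future,
                                  pool_past, pool_future, k=1)
```

In this toy pool, past-only selection cannot tell entries 0 and 1 apart because their observed histories are identical, whereas the prediction-guided variant picks the entry whose future also turns. That is the sense in which a predicted future lets selection account for longer-term dynamics.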
