GUIDE-CoT: Goal-driven and User-Informed Dynamic Estimation for Pedestrian Trajectory using Chain-of-Thought
Sungsik Kim, Janghyun Baek, Jinkyu Kim, Jaekoo Lee
TL;DR
GUIDE-CoT addresses the challenge of predicting full pedestrian trajectories by integrating a goal-oriented visual prompt with a chain-of-thought-inspired LLM. The approach combines a visual prompt with a pretrained visual encoder to produce accurate goal cues, then feeds structured reasoning prompts into an LLM to generate trajectories toward those goals, with an added user-guidance mechanism for directional or group-based adjustments. Training is decoupled into a visual-prompt goal predictor and a CoT LLM trajectory generator. The method achieves state-of-the-art results on ETH/UCY and offers controllable trajectory generation. This multimodal framework enhances the interpretability and adaptability of pedestrian trajectory prediction in dynamic urban environments, and the code is publicly available for replication.
Abstract
While Large Language Models (LLMs) have recently shown impressive results in reasoning tasks, their application to pedestrian trajectory prediction remains challenging due to two key limitations: insufficient use of visual information and the difficulty of predicting entire trajectories. To address these challenges, we propose Goal-driven and User-Informed Dynamic Estimation for pedestrian trajectory using Chain-of-Thought (GUIDE-CoT). Our approach integrates two innovative modules: (1) a goal-oriented visual prompt, which enhances goal prediction accuracy by combining visual prompts with a pretrained visual encoder, and (2) a chain-of-thought (CoT) LLM for trajectory generation, which generates realistic trajectories toward the predicted goal. Moreover, our method introduces controllable trajectory generation, allowing for flexible and user-guided modifications to the predicted paths. Through extensive experiments on the ETH/UCY benchmark datasets, our method achieves state-of-the-art performance, delivering both high accuracy and greater adaptability in pedestrian trajectory prediction. Our code is publicly available at https://github.com/ai-kmu/GUIDE-CoT.
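To make the second stage more concrete, the following is a minimal sketch of how a step-by-step prompt might be assembled from an observed trajectory, a predicted goal, and an optional user hint. It is an illustration under stated assumptions, not the authors' implementation: the function name `build_cot_prompt`, the prompt wording, and the coordinate formatting are all hypothetical, and the goal is assumed to have been produced already by a separate goal predictor.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def build_cot_prompt(
    observed: List[Point],
    goal: Point,
    user_hint: Optional[str] = None,
) -> str:
    """Assemble a stepwise text prompt (hypothetical format, not the paper's).

    The prompt lists the observed positions, then the predicted goal,
    then an optional user hint, and finally asks for the future positions.
    """
    history = ", ".join(f"({x:.2f}, {y:.2f})" for x, y in observed)
    steps = [
        f"Step 1. Observed positions: {history}.",
        f"Step 2. Predicted goal: ({goal[0]:.2f}, {goal[1]:.2f}).",
    ]
    # Assumption: user guidance is injected as one extra reasoning step.
    if user_hint is not None:
        steps.append(f"Step {len(steps) + 1}. User guidance: {user_hint}.")
    steps.append(
        f"Step {len(steps) + 1}. Generate the future positions that lead to the goal."
    )
    return "\n".join(steps)


if __name__ == "__main__":
    # Example: two observed positions, a goal, and a directional hint.
    print(build_cot_prompt([(0.0, 0.0), (0.5, 0.1)], (3.0, 1.0), user_hint="turn left"))
```

Keeping the goal and the user hint as separate prompt steps mirrors the decoupled design described above: the goal can come from any predictor, and the hint can be added or omitted without changing the rest of the prompt.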
