Adaptive Human Trajectory Prediction via Latent Corridors
Neerja Thakkar, Karttikeya Mangalam, Andrea Bajcsy, Jitendra Malik
TL;DR
The paper addresses the problem of adapting pre-trained human trajectory predictors to scene-specific, transient behaviors that arise during deployment. It introduces latent corridors: lightweight image-space prompts that are learned per deployment scene and added to the input heatmaps of a frozen base predictor, enabling data-efficient adaptation with minimal parameter overhead (<0.1%). The approach improves ADE by up to 23.9% on MOTSynth and on real datasets (up to 16.4% on MOT/Wildtrack, up to 26.8% on EarthCam), with additional benefits when combined with per-scene finetuning. The method also extends to architectures beyond YNet (e.g., PECNet-Ours achieves a 10.2% ADE improvement). Overall, latent corridors enable on-device, continual adaptation to changing scene context and transient events, improving ground-plane awareness and the modeling of scene-specific pedestrian behaviors in a data-efficient manner.
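To make the core idea concrete, here is a minimal toy sketch of adapting a frozen predictor through an additive input prompt. It is not the authors' implementation: the "predictor" is a hypothetical fixed linear map over a flattened heatmap, and the names `predict`, `mean_loss`, and `adapt_prompt`, the squared-error loss, and the plain gradient-descent update are all assumptions chosen for illustration. The only point it demonstrates is that the predictor weights stay untouched while a small per-scene prompt is learned.

```python
# Toy illustration only: a frozen linear "predictor" adapted via an additive,
# learnable input prompt. All names and the loss are hypothetical choices.

def predict(weights, heatmap, prompt):
    """Apply the frozen predictor to the prompt-augmented input heatmap."""
    augmented = [h + p for h, p in zip(heatmap, prompt)]
    return [sum(w * a for w, a in zip(row, augmented)) for row in weights]

def mean_loss(weights, samples, prompt):
    """Mean squared error of the predictions over the scene's samples."""
    total = 0.0
    for heatmap, target in samples:
        pred = predict(weights, heatmap, prompt)
        total += sum((o - t) ** 2 for o, t in zip(pred, target))
    return total / len(samples)

def adapt_prompt(weights, samples, n_pixels, lr=0.1, steps=200):
    """Learn a per-scene prompt by gradient descent; weights are never updated."""
    prompt = [0.0] * n_pixels
    for _ in range(steps):
        grad = [0.0] * n_pixels
        for heatmap, target in samples:
            pred = predict(weights, heatmap, prompt)
            residual = [o - t for o, t in zip(pred, target)]
            # d(loss)/d(prompt_j) = 2 * sum_k residual_k * weights[k][j]
            for k, row in enumerate(weights):
                for j in range(n_pixels):
                    grad[j] += 2.0 * residual[k] * row[j]
        prompt = [p - lr * g / len(samples) for p, g in zip(prompt, grad)]
    return prompt

# Frozen predictor weights (2 outputs, 4 "pixels") and a scene-specific shift
# that the base predictor does not know about.
WEIGHTS = [[0.5, -0.2, 0.1, 0.0], [0.0, 0.3, -0.4, 0.2]]
SCENE_SHIFT = [1.0, 0.0, -1.0, 0.5]
HEATMAPS = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
SAMPLES = [(h, predict(WEIGHTS, h, SCENE_SHIFT)) for h in HEATMAPS]

learned_prompt = adapt_prompt(WEIGHTS, SAMPLES, n_pixels=4)
print("loss before:", mean_loss(WEIGHTS, SAMPLES, [0.0] * 4))
print("loss after: ", mean_loss(WEIGHTS, SAMPLES, learned_prompt))
```

In this toy setting the prompt has as many entries as the input has pixels, whereas the predictor itself could be arbitrarily large; that is the sense in which prompt-style adaptation keeps the number of added parameters small.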
Abstract
Human trajectory prediction is typically posed as a zero-shot generalization problem: a predictor is learnt on a dataset of human motion in training scenes, and then deployed on unseen test scenes. While this paradigm has yielded tremendous progress, it fundamentally assumes that trends in human behavior within the deployment scene are constant over time. As such, current prediction models are unable to adapt to scene-specific transient human behaviors, such as crowds temporarily gathering to see buskers, pedestrians hurrying through the rain and avoiding puddles, or a protest breaking out. We formalize the problem of scene-specific adaptive trajectory prediction and propose a new adaptation approach inspired by prompt tuning called latent corridors. By augmenting the input of any pre-trained human trajectory predictor with learnable image prompts, the predictor can improve in the deployment scene by inferring trends from extremely small amounts of new data (e.g., 2 humans observed for 30 seconds). With less than 0.1% additional model parameters, we see up to 23.9% ADE improvement in MOTSynth simulated data and 16.4% ADE in MOT and Wildtrack real pedestrian data. Qualitatively, we observe that latent corridors imbue predictors with an awareness of scene geometry and scene-specific human behaviors that non-adaptive predictors struggle to capture. The project website can be found at https://neerja.me/atp_latent_corridors/.
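For readers unfamiliar with the metric, the sketch below shows how ADE (average displacement error, the mean Euclidean distance between predicted and ground-truth positions over the predicted timesteps) and a percentage improvement can be computed. The function names are hypothetical, the trajectories are made-up illustrative values rather than results from the paper, and reading "ADE improvement" as a relative reduction with respect to the base predictor is an assumption.

```python
import math

def ade(predicted, ground_truth):
    """Average displacement error: mean L2 distance over matched timesteps."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)

def relative_improvement(base_ade, adapted_ade):
    """Percentage reduction in ADE relative to the base predictor (assumed definition)."""
    return 100.0 * (base_ade - adapted_ade) / base_ade

# Illustrative (x, y) positions only; not data from the paper.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
base_pred = [(0.0, 2.0), (1.0, 2.0), (2.0, 2.0)]
adapted_pred = [(0.0, 1.5), (1.0, 1.5), (2.0, 1.5)]

base = ade(base_pred, truth)
adapted = ade(adapted_pred, truth)
print(base, adapted, relative_improvement(base, adapted))
```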
