Resonance: Learning to Predict Social-Aware Pedestrian Trajectories as Co-Vibrations
Conghao Wong, Ziqian Zou, Beihao Xia, Xinge You
TL;DR
This work tackles the challenge of accurately forecasting social-aware pedestrian trajectories by decoupling the randomness in agents' motion into distinct causes. The authors introduce Resonance (Re), a vibration-inspired model that forecasts trajectories as a superposition of multiple decoupled vibrations: a linear base, a self-sourced vibration for individual intentions, and a social-sourced resonance for interactions. Key contributions include a three-part prediction framework (linear base plus self-bias and re-bias), spectral representations via Haar transforms and Transformer encoders, and an angle-based resonance gathering mechanism that captures social dynamics. Experimental results across ETH-UCY, SDD, NBA, and nuScenes show competitive or superior performance on average and final displacement error (ADE/FDE) compared with state-of-the-art methods, with additional qualitative insights into the interpretability of the predicted biases and social resonance. Overall, Re advances explainable, socially aware trajectory prediction with a decoupled, spectral approach to modeling randomness and interactions, offering practical benefits for navigation and autonomous systems.
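To make the phrase "spectral representations via Haar transforms" more concrete, the following is a minimal sketch of a single-level orthonormal Haar transform applied to each axis of a 2D trajectory. It is an illustration only, not the authors' implementation: the function names, the single decomposition level, and the sample trajectory are all assumptions introduced here.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approx, detail): scaled pairwise sums and pairwise differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def inverse_haar_step(approx, detail):
    """Invert haar_step, recovering the original sequence."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / s)
        signal.append((a - d) / s)
    return signal


def trajectory_spectrum(trajectory):
    """Apply haar_step separately to the x and y coordinates of a trajectory."""
    xs = [p[0] for p in trajectory]
    ys = [p[1] for p in trajectory]
    return haar_step(xs), haar_step(ys)


# Hypothetical observed trajectory: 8 steps of constant-velocity walking.
observed = [(float(t), 0.5 * t) for t in range(8)]
(x_approx, x_detail), (y_approx, y_detail) = trajectory_spectrum(observed)
x_restored = inverse_haar_step(x_approx, x_detail)
```

The approximation coefficients carry the coarse motion and the detail coefficients carry step-to-step change. Because the transform is invertible, a model can operate on the coefficients and map its output back to coordinates.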
Abstract
Forecasting the trajectories of intelligent agents has attracted growing attention in recent years. However, it remains challenging to accurately account for agents' intentions and social behaviors when forecasting and, in particular, to simulate the distinct randomness within each of these components in an explainable and decoupled way. Inspired by vibration systems and their resonance properties, we propose the Resonance (Re for short) model, which encodes and forecasts pedestrian trajectories in the form of "co-vibrations". It decomposes trajectory modifications and their randomness into multiple vibration portions, each simulating agents' reactions to a single cause, and forecasts trajectories as the superposition of these independent vibrations. Benefiting from these vibrations and their spectral properties, representations of social interactions can also be learned by emulating the resonance phenomenon, further enhancing the model's explainability. Experiments on multiple datasets verify its usefulness both quantitatively and qualitatively.
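The superposition idea can be sketched in a few lines. The example below assumes the linear base is a per-axis least-squares line fitted to the observed steps and extrapolated forward. In the model the self-bias and re-bias terms would be produced by learned networks; here they are hand-written placeholder values. The function names `linear_base` and `superpose` and all numbers are hypothetical and serve only to illustrate the sum of three components.

```python
def linear_base(observed, horizon):
    """Fit a least-squares line per axis to the observed points and
    extrapolate it for `horizon` future steps (assumed form of the base)."""
    n = len(observed)
    if n < 2:
        raise ValueError("need at least two observed points")
    t_mean = (n - 1) / 2.0
    denom = sum((t - t_mean) ** 2 for t in range(n))
    coeffs = []
    for axis in range(2):
        vals = [p[axis] for p in observed]
        v_mean = sum(vals) / n
        slope = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(vals)) / denom
        coeffs.append((slope, v_mean - slope * t_mean))
    return [
        tuple(slope * t + intercept for slope, intercept in coeffs)
        for t in range(n, n + horizon)
    ]


def superpose(base, self_bias, re_bias):
    """Sum the three components step by step to form the final forecast."""
    return [
        (b[0] + s[0] + r[0], b[1] + s[1] + r[1])
        for b, s, r in zip(base, self_bias, re_bias)
    ]


# Hypothetical inputs: an agent walking along the x axis.
observed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
base = linear_base(observed, horizon=3)

# Placeholder biases (a learned model would predict these):
self_bias = [(0.0, 0.1), (0.0, 0.2), (0.0, 0.3)]    # agent drifts toward its goal
re_bias = [(-0.1, 0.0), (-0.2, 0.0), (-0.3, 0.0)]   # agent slows for a neighbor

prediction = superpose(base, self_bias, re_bias)
```

Keeping the three terms separate lets each one be inspected on its own, which is the kind of decoupling the abstract points to when it mentions explainability.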
