Reverberation: Learning the Latencies Before Forecasting Trajectories
Conghao Wong, Ziqian Zou, Beihao Xia, Xinge You
TL;DR
The paper tackles explicit modeling of temporal latencies in multi-agent trajectory forecasting by introducing the Reverberation Transform, which learns two latency kernels, an event-level kernel (R) and a variability kernel (G), to map past sequence similarities into latency-aware rehearsal representations. The Rev model uses these representations to decompose predictions into linear, non-interactive, and social components, producing multiple latency-conditioned trajectories trained with a best-of-K_g L2 loss. Across ETH-UCY, SDD, and nuScenes, Rev achieves competitive accuracy while revealing interpretable latency dynamics, and extensive ablations validate the contribution of each kernel and latency branch. The work provides a principled framework for latency-aware sequential decision-making with potential generalization to broader spatio-temporal domains.
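As a rough illustration of the best-of-K_g L2 loss mentioned above, the sketch below shows the generic "minimum over candidates" idea commonly used in multi-modal trajectory prediction. It is not taken from the paper: the function name `best_of_k_l2`, the array shapes, and the choice to average point-wise distances over time steps are all assumptions made for this example.

```python
import numpy as np

def best_of_k_l2(preds, target):
    """Hypothetical best-of-K L2 loss for one agent.

    preds:  (K, T, 2) array, K candidate future trajectories of T 2D points.
    target: (T, 2) array, the ground-truth future trajectory.
    Returns (loss, index), where loss is the smallest time-averaged
    Euclidean distance among the K candidates (an assumed reduction).
    """
    preds = np.asarray(preds, dtype=float)
    target = np.asarray(target, dtype=float)
    # Point-wise Euclidean distance for each candidate and time step: (K, T)
    dists = np.linalg.norm(preds - target[None, :, :], axis=-1)
    # Average over time steps to get one score per candidate: (K,)
    per_candidate = dists.mean(axis=-1)
    # Only the closest candidate contributes to the loss
    best = int(per_candidate.argmin())
    return float(per_candidate[best]), best

# Toy usage: two candidates, one shifted by (0, 1) and one by (3, 4)
target = np.array([[0.0, 0.0], [1.0, 0.0]])
preds = np.stack([target + [0.0, 1.0], target + [3.0, 4.0]])
loss, best = best_of_k_l2(preds, target)
```

Because only the best candidate is penalized, the remaining candidates are free to cover other plausible futures, which is the usual motivation for this kind of objective in multi-hypothesis forecasting.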
Abstract
Bridging the past to the future, connecting agents both spatially and temporally, lies at the core of the trajectory prediction task. Despite great efforts, it remains challenging to explicitly learn and predict latencies, i.e., the response intervals or temporal delays with which agents respond to various trajectory-changing events and adjust their future paths, whether on their own or interactively. Different agents may exhibit distinct latency preferences for noticing, processing, and reacting to a specific trajectory-changing event. Ignoring such latencies may undermine the causal continuity of forecasting systems, leading to implausible or unintended trajectories. Inspired by reverberation in acoustics, we propose a new reverberation transform and the corresponding Reverberation (Rev for short) trajectory prediction model. Rev predicts both individual latency preferences and their stochastic variations by using two explicit and learnable reverberation kernels, enabling latency-conditioned and controllable trajectory prediction for both non-interactive and social latencies. Experiments on multiple pedestrian and vehicle datasets demonstrate that Rev achieves competitive accuracy while revealing interpretable latency dynamics across agents and scenarios. Qualitative analyses further verify the properties of the reverberation transform, highlighting its potential as a general latency modeling approach.
