Reverberation: Learning the Latencies Before Forecasting Trajectories

Conghao Wong, Ziqian Zou, Beihao Xia, Xinge You

TL;DR

The paper models temporal latencies explicitly in multi-agent trajectory forecasting. It introduces the reverberation transform, which learns two latency kernels, an event-level kernel $\mathbf{R}$ and a variability kernel $\mathbf{G}$, that map past sequence similarities into latency-aware rehearsal representations. The Rev model uses these representations to decompose each prediction into linear, non-interactive, and social components, and produces multiple latency-conditioned trajectories trained with a best-of-$K_g$ L2 loss. Across ETH-UCY, SDD, and nuScenes, Rev achieves competitive accuracy while revealing interpretable latency dynamics, and ablations validate the contribution of each kernel and latency branch. The work offers a framework for latency-aware sequence modeling that may generalize to broader spatio-temporal domains.
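The best-of-$K_g$ L2 loss mentioned above can be illustrated with a minimal NumPy sketch. The summary does not give the exact form of the loss, so the choices here are assumptions: the function name `best_of_k_l2_loss` is hypothetical, the distance is averaged over future steps, and a single agent is scored at a time.

```python
import numpy as np

def best_of_k_l2_loss(pred, target):
    """Score only the candidate trajectory closest to the ground truth.

    pred:   (K, T, 2) array of K candidate future trajectories.
    target: (T, 2) array with the ground-truth future trajectory.
    Returns (loss, index of the best candidate).
    Assumption: the loss is the mean point-wise L2 distance over T steps.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Euclidean distance at every future step for every candidate: (K, T)
    step_dist = np.linalg.norm(pred - target[None], axis=-1)
    # Average over time to get one score per candidate: (K,)
    per_candidate = step_dist.mean(axis=1)
    best = int(np.argmin(per_candidate))
    return float(per_candidate[best]), best

# Toy data: the target moves along the x-axis for 6 steps; each candidate
# is the same path shifted in y by a different offset.
t_f = 6
target = np.stack([np.arange(t_f, dtype=float), np.zeros(t_f)], axis=1)
offsets = [0.5, 0.1, 2.0]
pred = np.stack([target + np.array([0.0, dy]) for dy in offsets])

loss, best = best_of_k_l2_loss(pred, target)
print(best, loss)  # the candidate at index 1 has the smallest offset
```

Because only the closest candidate contributes to the loss, the remaining candidates are free to cover other plausible futures, which is the usual motivation for best-of-K training in multi-modal trajectory prediction.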

Abstract

Bridging the past to the future and connecting agents both spatially and temporally lie at the core of the trajectory prediction task. Despite great efforts, it remains challenging to explicitly learn and predict latencies, i.e., the response intervals or temporal delays with which agents respond to trajectory-changing events and adjust their future paths, whether on their own or interactively. Different agents may exhibit distinct latency preferences for noticing, processing, and reacting to a specific trajectory-changing event. Ignoring such latencies may undermine the causal continuity of forecasting systems, leading to implausible or unintended trajectories. Inspired by reverberation in acoustics, we propose a new reverberation transform and the corresponding Reverberation (Rev for short) trajectory prediction model. Using two explicit and learnable reverberation kernels, Rev predicts both individual latency preferences and their stochastic variations, enabling latency-conditioned and controllable trajectory prediction for both non-interactive and social latencies. Experiments on multiple datasets, covering both pedestrians and vehicles, demonstrate that Rev achieves competitive accuracy while revealing interpretable latency dynamics across agents and scenarios. Qualitative analyses further verify the properties of the reverberation transform, highlighting its potential as a general latency modeling approach.

Paper Structure

This paper contains 39 sections, 20 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: Motivation illustration. Inspired by acoustic reverberation, which describes how an echo decays in space after reflection, we model how past trajectory-changing events (such as intention changes or interactions) alter trajectories, together with their latencies, as "echoes from the past".
  • Figure 2: Computation pipelines of the proposed reverberation transform. It learns two reverberation kernels, $\mathbf{R}$ and $\mathbf{G}$, that map the sequential feature $\mathbf{f}$ from the observation domain into the imagined rehearsal domain.
  • Figure 3: Overview of the proposed Reverberation (Rev for short) trajectory prediction model. For any ego $i$, it predicts trajectories as two learnable parts, the non-interactive $\Delta \hat{\mathbf{Y}}^i_\mathrm{non}$ and the social $\Delta \hat{\mathbf{Y}}^i_\mathrm{soc}$, each with its own non-interactive or social latencies, and adds them to the linear prediction $\hat{\mathbf{Y}}^i_\mathrm{lin}$, which serves as the reference.
  • Figure 4: Examples of non-interactive reverberation strength $r_\mathrm{non} (t|t_p)~(t_p \in \{1, 2, 3, 4\})$ of kernel $\mathbf{R}_\mathrm{non}$ in the proposed Rev model. ($\{T_h, T_f\} = \{4, 6\}$)
  • Figure 5: Examples of the original non-interactive reverberation strength $r_\mathrm{non} (t|t_p)$ ($t_p \in \{1, 2, 3, 4\}$, in blue boxes) and its $K_g = 20$ altered $\tilde{r}_\mathrm{non}^k (t|t_p)$ ($t_p \in \{1, 2, 3, 4\}$, $k \in \{1, 2, \ldots, K_g\}$) curves after applying the generating kernel $\mathbf{G}_\mathrm{non}$. $k$ refers to indices of subfigures in grey boxes.
  • ...and 7 more figures
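
The decomposition described in the Figure 3 caption, a linear reference prediction plus non-interactive and social terms, can be sketched as follows. This is only an illustration under stated assumptions: the captions do not say how $\hat{\mathbf{Y}}^i_\mathrm{lin}$ is computed, so a per-coordinate least-squares line fit is assumed here, the function names are hypothetical, and the two residual terms are zero placeholders rather than outputs of the Rev model.

```python
import numpy as np

def linear_prediction(obs, t_f):
    """Extrapolate an observed trajectory linearly for t_f future steps.

    obs: (T_h, 2) observed positions. Returns a (t_f, 2) array.
    Assumption: a least-squares line is fitted to each coordinate.
    """
    obs = np.asarray(obs, dtype=float)
    t_h = obs.shape[0]
    t_obs = np.arange(t_h)
    t_fut = np.arange(t_h, t_h + t_f)
    # With a 2-D y, np.polyfit fits each column independently and returns
    # coefficients of shape (deg + 1, 2): row 0 is slope, row 1 is intercept.
    slope, intercept = np.polyfit(t_obs, obs, deg=1)
    return t_fut[:, None] * slope[None] + intercept[None]

def compose_prediction(y_lin, delta_non, delta_soc):
    """Add the non-interactive and social terms onto the linear reference."""
    return y_lin + delta_non + delta_soc

# T_h = 4 observed steps and T_f = 6 future steps, matching the
# {T_h, T_f} = {4, 6} setting quoted in the Figure 4 caption.
obs = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 1.5]])
y_lin = linear_prediction(obs, t_f=6)

# Placeholder residuals; in Rev these would come from the two learned branches.
delta_non = np.zeros_like(y_lin)
delta_soc = np.zeros_like(y_lin)
y_hat = compose_prediction(y_lin, delta_non, delta_soc)
print(y_hat.shape)
```

With both residuals set to zero the result reduces to the linear reference, which makes explicit that the two learned branches only need to model deviations from straight-line motion.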