TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception

Zhiying Song; Lei Yang; Fuxi Wen; Jun Li

TL;DR

TraF-Align tackles inter-agent latency in asynchronous V2X cooperative perception by predicting object trajectories from past observations and guiding cross-agent feature interaction along those trajectories. The method introduces a field predictor to generate trajectory fields, an offset generator to produce attention sampling points, and trajectory-aware attention to align and reconstruct current-time features for robust fusion. End-to-end training employs a field loss and an offset loss (with Sinkhorn matching) to supervise trajectory alignment and attention point generation, achieving state-of-the-art results on V2V4Real and DAIR-V2X-Seq under latencies up to $400$ ms. The work enables coherent semantic fusion across frames and agents, improving detection accuracy and latency robustness, with practical impact for real-world asynchronous cooperative perception systems.

Abstract

Cooperative perception has significant potential to enhance the sensing capabilities of individual vehicles; however, inter-agent latency remains a critical challenge. Latency causes misalignment in both spatial and semantic features, complicating the fusion of real-time observations from the ego vehicle with delayed data from other agents. To address these issues, we propose TraF-Align, a novel framework that learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle's current time. By generating temporally ordered sampling points along these paths, TraF-Align directs attention from the current-time query to relevant historical features along each trajectory, supporting the reconstruction of current-time features and promoting semantic interaction across multiple frames. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion and achieving coherent feature fusion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
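
To make the idea of attending along a trajectory concrete, here is a minimal sketch under simplifying assumptions. A single current-time query samples one point per past frame at a given offset and fuses the sampled features with softmax attention. The offsets are hard-coded here, whereas the abstract describes them as generated from predicted trajectories. The function names, the single-query setup, the dot-product scoring, and the toy feature maps are hypothetical and do not reproduce the authors' implementation.

```python
import math


def bilinear_sample(fmap, x, y):
    """Bilinearly sample an H x W x C nested-list feature map at (x, y)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp the sampling location to the map borders.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    channels = len(fmap[0][0])
    return [
        (1 - ay) * ((1 - ax) * fmap[y0][x0][k] + ax * fmap[y0][x1][k])
        + ay * ((1 - ax) * fmap[y1][x0][k] + ax * fmap[y1][x1][k])
        for k in range(channels)
    ]


def trajectory_attention(query, position, past_maps, offsets):
    """Fuse one sampled feature per past frame into a current-time feature.

    past_maps is ordered oldest first; offsets holds one (dx, dy) per frame,
    pointing from the query position back to where the object was.
    """
    qx, qy = position
    samples = [
        bilinear_sample(fmap, qx + dx, qy + dy)
        for fmap, (dx, dy) in zip(past_maps, offsets)
    ]
    # Scaled dot-product scores between the query and each sampled feature.
    scale = math.sqrt(len(query))
    scores = [sum(q * s for q, s in zip(query, feat)) / scale for feat in samples]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        sum(w * feat[k] for w, feat in zip(weights, samples))
        for k in range(len(query))
    ]
    return fused, weights


def make_map(obj_x, obj_y, size=4):
    """Toy 2-channel map: zeros everywhere, feature [1, 0] at the object cell."""
    fmap = [[[0.0, 0.0] for _ in range(size)] for _ in range(size)]
    fmap[obj_y][obj_x] = [1.0, 0.0]
    return fmap


# An object moving one cell per frame along x; the query sits where it is now.
past_maps = [make_map(0, 1), make_map(1, 1)]  # oldest frame first
query, position = [1.0, 0.0], (2.0, 1.0)

# Offsets that follow the trajectory recover the object feature.
aligned, weights = trajectory_attention(query, position, past_maps, [(-2.0, 0.0), (-1.0, 0.0)])
# Zero offsets sample the stale location and miss the object entirely.
unaligned, _ = trajectory_attention(query, position, past_maps, [(0.0, 0.0), (0.0, 0.0)])
print(aligned, unaligned)
```

In this toy case the trajectory-following offsets return the object feature, while zero offsets return an empty feature, which is the spatial misalignment that latency introduces.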
