Stream Query Denoising for Vectorized HD Map Construction
Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao
TL;DR
This work addresses how to exploit temporal information in streaming, vectorized HD-map construction. It introduces Stream Query Denoising (SQD), a training-time strategy that adds noise to the previous-frame ground truth and denoises it toward the current-frame ground truth, simulating the prediction process of stream queries so that the model learns temporal consistency among map elements. SQD comprises normal query denoising for curve perturbations and a dedicated stream denoising pathway with Adaptive Temporal Matching (ATM) and Dynamic Query Noising, trained with a joint loss that couples map predictions with denoising predictions. On nuScenes and Argoverse2, SQD-MapNet surpasses prior streaming approaches at both short and long perception ranges, and ablations highlight the contribution of ATM and dynamic noise.
Abstract
To improve perception in complex, large-scale autonomous driving scenarios, temporal modeling has received considerable attention, with streaming methods in particular gaining ground. Streaming models typically propagate temporal information through stream queries. However, applying this streaming paradigm directly to the construction of vectorized high-definition maps (HD-maps) fails to fully exploit the available temporal information. This paper introduces the Stream Query Denoising (SQD) strategy, a novel approach to temporal modeling in HD-map construction. SQD is designed to help the streaming model learn temporal consistency among map elements. It builds queries by adding noise to the ground-truth information from the preceding frame and trains the model to denoise them, reconstructing the ground-truth information for the current frame and thereby simulating the prediction process of stream queries. The SQD strategy can be applied to streaming methods (e.g., StreamMapNet) to enhance their temporal modeling. The proposed SQD-MapNet is StreamMapNet equipped with SQD. Extensive experiments on nuScenes and Argoverse2 show that our method outperforms existing methods across all close-range and long-range settings. The code will be available soon.
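The denoising idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the uniform noise distribution, the `noise_scale` parameter, the L1 loss, and the ego-motion alignment of the previous-frame ground truth via a 2D rigid transform are all assumptions made for illustration only.

```python
import numpy as np

def make_stream_denoising_query(prev_gt, prev_to_curr, noise_scale=0.5, rng=None):
    """Build a noised query from a previous-frame ground-truth polyline.

    prev_gt:      (P, 2) polyline points in the previous ego frame.
    prev_to_curr: (3, 3) homogeneous 2D transform into the current ego frame
                  (assumed alignment step, hypothetical).
    noise_scale:  half-width of the uniform perturbation (hypothetical).
    """
    rng = np.random.default_rng() if rng is None else rng
    prev_gt = np.asarray(prev_gt, dtype=float)
    # Lift points to homogeneous coordinates and move them to the current frame.
    homo = np.hstack([prev_gt, np.ones((len(prev_gt), 1))])
    warped = (homo @ np.asarray(prev_to_curr, dtype=float).T)[:, :2]
    # Perturb each point so the query resembles an imperfect stream prediction.
    noise = rng.uniform(-noise_scale, noise_scale, size=warped.shape)
    return warped + noise

def denoising_loss(pred, curr_gt):
    """Mean L1 distance between denoised points and current-frame ground truth."""
    return float(np.abs(np.asarray(pred) - np.asarray(curr_gt)).mean())

# Example: a straight lane segment, with the ego vehicle moving 1 unit along x.
prev_gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
prev_to_curr = np.array([[1.0, 0.0, -1.0],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, 1.0]])
query = make_stream_denoising_query(prev_gt, prev_to_curr, noise_scale=0.2,
                                    rng=np.random.default_rng(0))
```

In a real training loop, a decoder would take `query` as input and its output would be compared with the current-frame ground truth through a loss such as `denoising_loss`. The decoder itself is omitted here.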
