PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction
Nan Peng, Xun Zhou, Mingming Wang, Xiaojun Yang, Songming Chen, Guisong Chen
TL;DR
PrevPredMap introduces a temporal modeling framework for online vectorized HD map construction: a previous-predictions-based query generator encodes previous predictions into queries, and a dynamic-position-query decoder uses those queries to produce the current predictions. A dual-mode training strategy keeps performance robust in both single-frame and temporal modes, supported by an enhanced single-frame baseline with a memory-efficient group-wise one-to-many branch. On nuScenes and Argoverse2, PrevPredMap achieves state-of-the-art results with favorable inference speed, and ablation analyses confirm the contributions of the core modules. The work suggests that high-level predictions can serve as compact temporal priors and points toward integrating map priors and longer histories for further gains.
Abstract
Temporal information is crucial for detecting occluded instances. Existing temporal representations have progressed from bird's-eye-view (BEV) or perspective-view (PV) features to more compact query features. Compared with these features, predictions offer the highest level of abstraction and provide explicit information. In online vectorized HD map construction, this characteristic is potentially advantageous for long-term temporal modeling and for integrating map priors. This paper introduces PrevPredMap, a temporal modeling framework that leverages previous predictions to construct online vectorized HD maps. PrevPredMap has two essential modules: the previous-predictions-based query generator and the dynamic-position-query decoder. The query generator separately encodes the different types of information in previous predictions, and the dynamic-position-query decoder uses these encodings to generate the current predictions. Furthermore, a dual-mode strategy ensures that PrevPredMap performs robustly in both single-frame and temporal modes. Extensive experiments demonstrate that PrevPredMap achieves state-of-the-art performance on the nuScenes and Argoverse2 datasets. Code will be available at https://github.com/pnnnnnnn/PrevPredMap.
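To make the idea of reusing previous predictions concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a previous prediction consists of a category, a confidence score, and polyline points, that the ego motion between frames is a known 2D rigid transform, and that low-confidence predictions are dropped. The names `MapPrediction`, `to_current_frame`, `build_queries`, and `pick_mode`, as well as the threshold and probability values, are hypothetical. In the actual framework the separate fields would be embedded by learned layers rather than kept as raw values.

```python
import math
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class MapPrediction:
    """One vectorized map element from the previous frame (hypothetical container)."""
    label: str           # e.g. "divider", "ped_crossing", "boundary"
    score: float         # classification confidence in [0, 1]
    points: List[Point]  # polyline vertices in the previous ego frame


def to_current_frame(points: List[Point], dx: float, dy: float, yaw: float) -> List[Point]:
    """Rotate by yaw, then translate by (dx, dy): previous ego frame -> current ego frame."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


def build_queries(prev: List[MapPrediction], dx: float, dy: float, yaw: float,
                  score_thresh: float = 0.4, temporal: bool = True) -> List[Dict]:
    """Keep category, confidence, and position as separate fields per query.

    Returns an empty list in single-frame mode, where a decoder would fall
    back to its default queries (assumption).
    """
    if not temporal or not prev:
        return []
    queries = []
    for p in prev:
        if p.score < score_thresh:  # assumed filtering rule, not from the paper
            continue
        queries.append({
            "category": p.label,
            "confidence": p.score,
            "position": to_current_frame(p.points, dx, dy, yaw),
        })
    return queries


def pick_mode(rng: random.Random, temporal_prob: float = 0.5) -> bool:
    """Dual-mode training sketch: choose temporal (True) or single-frame (False) per sample."""
    return rng.random() < temporal_prob
```

For example, calling `build_queries(prev, dx, dy, yaw, temporal=pick_mode(random.Random(0)))` during training would expose a model to both modes, which is one simple way to read the dual-mode strategy described above.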
