FastMap: Fast Queries Initialization Based Vectorized HD Map Reconstruction Framework

Haotian Hu, Jingwei Xu, Fanyi Wang, Toyota Li, Yaonong Wang, Laifeng Hu, Zhiwang Zhang

TL;DR

FastMap tackles the efficiency bottleneck of DETR-style decoders in online HD map reconstruction by replacing the usual stack of six decoder layers with a single-layer, two-stage transformer. Queries are initialized from a predicted heatmap rather than at random, and a geometry-aware point-to-line loss supervises map element geometry without forcing predictions to match exact annotated points. The approach combines coarse-to-fine cross-attention, geometric priors obtained through a circular sampling strategy, and a multi-term loss. It achieves state-of-the-art mAP on nuScenes and Argoverse2 with a decoder reported to run 3.2 times faster than the baseline, which benefits downstream prediction and planning in autonomous driving.

Abstract

Reconstruction of high-definition maps is a crucial task in perceiving the autonomous driving environment, as its accuracy directly impacts the reliability of prediction and planning capabilities in downstream modules. Current vectorized map reconstruction methods based on the DETR framework encounter limitations due to the redundancy in the decoder structure, necessitating the stacking of six decoder layers to maintain performance, which significantly hampers computational efficiency. To tackle this issue, we introduce FastMap, an innovative framework designed to reduce decoder redundancy in existing approaches. FastMap optimizes the decoder architecture by employing a single-layer, two-stage transformer that achieves multilevel representation capabilities. Our framework eliminates the conventional practice of randomly initializing queries and instead incorporates a heatmap-guided query generation module during the decoding phase, which effectively maps image features into structured query vectors using learnable positional encoding. Additionally, we propose a geometry-constrained point-to-line loss mechanism for FastMap, which adeptly addresses the challenge of distinguishing highly homogeneous features that often arise in traditional point-to-point loss computations. Extensive experiments demonstrate that FastMap achieves state-of-the-art performance on both the nuScenes and Argoverse2 datasets, with its decoder operating 3.2× faster than the baseline. Code and more demos are available at https://github.com/hht1996ok/FastMap.
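
To illustrate why a point-to-line loss can be more forgiving than a point-to-point loss, the sketch below compares the two on a straight lane segment. It is a minimal illustration and not the paper's formulation: the function names, the use of mean Euclidean distance, and the index-wise pairing of predicted and ground-truth points are all assumptions made for this example.

```python
import math


def point_to_segment_distance(p, a, b):
    """Euclidean distance from 2D point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_to_line_loss(pred_points, gt_polyline):
    """Mean distance from each predicted point to its nearest GT segment.

    Hypothetical helper: assumes gt_polyline has at least two vertices.
    """
    segments = list(zip(gt_polyline[:-1], gt_polyline[1:]))
    total = sum(
        min(point_to_segment_distance(p, a, b) for a, b in segments)
        for p in pred_points
    )
    return total / len(pred_points)


def point_to_point_loss(pred_points, gt_points):
    """Mean distance between index-paired predicted and GT points."""
    total = sum(
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(pred_points, gt_points)
    )
    return total / len(pred_points)


# A straight lane boundary and three GT points sampled along it.
gt_polyline = [(0.0, 0.0), (4.0, 0.0)]
gt_points = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
# Predictions lie on the line but are shifted along it.
pred_points = [(1.0, 0.0), (3.0, 0.0), (3.0, 0.0)]

p2p = point_to_point_loss(pred_points, gt_points)
p2l = point_to_line_loss(pred_points, gt_polyline)
print(p2p, p2l)
```

In this toy case the predicted points sit exactly on the ground-truth line, so the point-to-line term is zero while the point-to-point term still penalizes the shift along the line. This matches the abstract's motivation only in spirit; the paper's actual loss may weight or combine the terms differently.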

Paper Structure

This paper contains 28 sections, 9 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Comparison graphs of mAP and FPS between FastMap and existing models.
  • Figure 2: Overall framework of FastMap. The Decoder Position illustrates how positional information is propagated in the decoder for queries, keys, and values. TopK denotes the operation of selecting the K elements with the largest values. CSM denotes the Circular Sampling Method. HGQG Module denotes the heatmap-guided query generation module. Gather refers to the operation of extracting elements based on indices. Norm indicates the normalization operation performed according to spatial length.
  • Figure 3: Visualization of the heatmap ground truth before and after inflation.
  • Figure 4: Line graphs of AP over training for MapTR, MapTRv2, FastMap-tiny, and FastMap-base. The x-axis denotes the current epoch and the y-axis denotes the corresponding AP.
  • Figure 5: Visualization of the heatmap ground truth, heatmap, geometric priors, and prediction results.
  • ...and 1 more figure
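
The TopK and Gather operations named in the Figure 2 caption can be sketched in a few lines. This is a hedged illustration of heatmap-guided query initialization, not the authors' HGQG module: the helper names, the nested-list data layout, and the choice to add a positional embedding to each gathered feature are assumptions made for this example.

```python
def topk_indices(heatmap, k):
    """Return (row, col) positions of the k largest heatmap scores."""
    cells = [
        (score, r, c)
        for r, row in enumerate(heatmap)
        for c, score in enumerate(row)
    ]
    # Stable sort by descending score; ties keep row-major order.
    cells.sort(key=lambda cell: -cell[0])
    return [(r, c) for _, r, c in cells[:k]]


def gather(grid, indices):
    """Extract the per-cell vectors of grid at the given (row, col) indices."""
    return [grid[r][c] for r, c in indices]


def init_queries(heatmap, features, pos_embed, k):
    """Build k query vectors from the highest-scoring heatmap locations.

    Assumption: a query is the gathered feature plus the positional
    embedding stored for the same cell.
    """
    idx = topk_indices(heatmap, k)
    feats = gather(features, idx)
    pos = gather(pos_embed, idx)
    queries = [[f + p for f, p in zip(fv, pv)] for fv, pv in zip(feats, pos)]
    return queries, idx


# Toy 2x2 grid with 2-channel features (all values are made up).
heatmap = [[0.1, 0.9],
           [0.8, 0.2]]
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[2.0, 2.0], [3.0, 3.0]]]
pos_embed = [[[0.5, 0.5], [0.5, 0.5]],
             [[0.5, 0.5], [0.5, 0.5]]]

queries, idx = init_queries(heatmap, features, pos_embed, k=2)
print(idx, queries)
```

Starting queries at high-response locations gives the decoder content that is already tied to likely map elements, which is the stated reason FastMap can drop random query initialization. In a tensor framework the same steps would typically be expressed with batched top-k and gather operations.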