MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction

Xiaolu Liu; Song Wang; Wentong Li; Ruizi Yang; Junbo Chen; Jianke Zhu

TL;DR

MGMap tackles precise online HD map vectorization under subtle and sparse annotations by introducing mask-guided learning. It combines an enhanced multi-level BEV feature extractor with a Mask-Activated Instance (MAI) decoder and a Position-Guided Mask Patch Refinement (PG-MPR) module to perform coarse instance-level and fine-grained point-level localization. Learned instance masks highlight informative regions, while binary mask features refine point locations through ROI-based patch features. The method improves on its baselines by around 10 mAP across input modalities, with experiments on nuScenes and Argoverse2. The results also show strong robustness and generalization, with notable improvements over state-of-the-art approaches, and ablations confirm the contribution of each component.

Abstract

High-definition (HD) map construction is currently shifting toward lightweight online generation, which aims to preserve timely and reliable road scene information. However, map elements carry strong shape priors, and their subtle and sparse annotations leave current detection-based frameworks ambiguous about which feature regions are relevant, causing detailed structures to be lost in prediction. To alleviate these problems, we propose MGMap, a mask-guided approach that highlights informative regions and achieves precise map element localization by introducing learned masks. Specifically, MGMap applies learned masks to the enhanced multi-scale BEV features from two perspectives. At the instance level, we propose the Mask-Activated Instance (MAI) decoder, which incorporates global instance and structural information into instance queries through the activation of instance masks. At the point level, a novel Position-Guided Mask Patch Refinement (PG-MPR) module refines point locations from a finer-grained perspective, enabling the extraction of point-specific patch information. Compared to the baselines, MGMap achieves a notable improvement of around 10 mAP for different input modalities. Extensive experiments also demonstrate that our approach has strong robustness and generalization capabilities. Our code can be found at https://github.com/xiaolul2/MGMap.
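To make the instance-level idea concrete, the sketch below shows one simple way an instance mask could "activate" an instance query: the mask is turned into per-cell weights, the BEV features are averaged under those weights, and the pooled vector is added to the query. This is an illustrative assumption, not the MAI decoder as implemented in the paper. The function names `mask_pooled_feature` and `activate_query`, the additive update, and the nested-list tensor layout are all hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mask_pooled_feature(bev, mask_logits, eps=1e-6):
    """Mask-weighted average of BEV features (hypothetical helper).

    bev:         H x W x C nested lists of floats
    mask_logits: H x W nested lists of floats (one instance mask)
    """
    channels = len(bev[0][0])
    total = [0.0] * channels
    weight_sum = 0.0
    for feat_row, logit_row in zip(bev, mask_logits):
        for feat, logit in zip(feat_row, logit_row):
            w = sigmoid(logit)  # soft mask weight in (0, 1)
            weight_sum += w
            for c in range(channels):
                total[c] += w * feat[c]
    return [t / (weight_sum + eps) for t in total]


def activate_query(query, bev, mask_logits):
    """Assumed update rule: add the mask-pooled feature to the query."""
    pooled = mask_pooled_feature(bev, mask_logits)
    return [q + p for q, p in zip(query, pooled)]


# Toy 2 x 2 BEV grid with 2 channels; the mask selects only the top-left cell.
bev = [[[1.0, 0.0], [0.0, 1.0]],
       [[0.0, 1.0], [0.0, 1.0]]]
mask_logits = [[20.0, -20.0],
               [-20.0, -20.0]]
activated = activate_query([0.5, 0.5], bev, mask_logits)
```

In this toy case the activated query is pulled toward the features of the single cell the mask highlights, so regions outside the mask contribute almost nothing to the query update.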

Paper Structure (23 sections, 14 equations, 11 figures, 9 tables)

Introduction
Related Work
MGMap
BEV Feature Extraction
Mask-Activated Instance Decoder
Position-Guided Mask Patch Refinement
Training Loss
Experiments
Datasets and Benchmarks
Evaluation Metrics
Implementation Details
Main Results
Ablation Study
Conclusion
More Details of Our Method
...and 8 more sections

Figures (11)

Figure 1: For some detailed structures, our proposed MGMap achieves effective map element localization by highlighting the informative regions through the learned masks.
Figure 2: Overview of the MGMap framework. MGMap consists of three main components: (1) a BEV extractor that obtains multi-scale BEV features by transforming from perspective view (PV) to BEV with the enhanced multi-level neck; (2) a Mask-Activated Instance (MAI) decoder that constructs and updates queries at the instance level; (3) a Position-Guided Mask Patch Refinement (PG-MPR) module that refines point positions from local patch features at the point level.
Figure 3: Illustration of mask construction at different stages. In the MAI decoder, instance masks are generated to activate lane queries, while binary masks are extracted to provide fine-grained patch features in PG-MPR.
Figure 4: (a) Conventional deformable attention extracts sparse features from sampling points, which may select irrelevant features; (b) our proposed Mask Patch Refinement extracts more relevant features from a reliable patch region.
Figure 5: Visual results of MapTR [liao2022maptr], our proposed MGMap approach, and the corresponding ground truth.
...and 6 more figures
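
The contrast drawn in Figure 4 (sparse sampling points versus a dense local patch) can be illustrated with a toy point-refinement routine. The sketch below crops a square window of a binary mask around a predicted point and moves the point to the mask-weighted centroid of that window. The centroid rule is a deliberately simple stand-in for the learned refinement in PG-MPR, which this page does not specify. The names `crop_patch` and `refine_point`, the `radius` parameter, and the integer grid coordinates are assumptions for illustration only.

```python
def crop_patch(mask, cx, cy, radius):
    """Collect (row, col, value) for cells in a square window around (cx, cy).

    The window is clipped to the mask bounds. cx is a column index and
    cy a row index (hypothetical convention).
    """
    h, w = len(mask), len(mask[0])
    cells = []
    for r in range(max(0, cy - radius), min(h, cy + radius + 1)):
        for c in range(max(0, cx - radius), min(w, cx + radius + 1)):
            cells.append((r, c, mask[r][c]))
    return cells


def refine_point(mask, point, radius=1, eps=1e-6):
    """Move a point to the mask-weighted centroid of its local patch.

    If the patch contains no mask response, the point is left unchanged.
    """
    cx, cy = point
    cells = crop_patch(mask, cx, cy, radius)
    total = sum(v for _, _, v in cells)
    if total < eps:
        return (float(cx), float(cy))
    new_x = sum(c * v for _, c, v in cells) / total
    new_y = sum(r * v for r, _, v in cells) / total
    return (new_x, new_y)


# 5 x 5 binary mask with a vertical map element along column 3.
mask = [[1 if c == 3 else 0 for c in range(5)] for _ in range(5)]

# A prediction one cell to the left of the element snaps onto it.
refined = refine_point(mask, (2, 2), radius=1)

# A prediction whose patch sees no mask response stays where it is.
unchanged = refine_point(mask, (0, 0), radius=1)
```

Because every cell in the patch contributes, a nearby map element influences the result even when the initial point does not lie on it, which is the advantage over sparse sampling that the caption describes.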
