Learning Global Representation from Queries for Vectorized HD Map Construction
Shoumeng Qiu, Xinrun Li, Yang Long, Xiangyang Xue, Varun Ojha, Jian Pu
TL;DR
This work addresses online vectorized HD map construction and identifies a limitation of DETR-like query learning: queries typically capture local, instance-level details rather than the global map structure. It introduces MapGR, which has two components: Global Representation Learning (GRL), which derives a global map embedding from all queries, and Global Representation Guidance (GRG), which fuses this global context into each query so that holistic and local optimization proceed together. GRL is supervised by a rasterized BEV map derived from the ground-truth map, while GRG injects the global embedding into the per-query representations to improve decoding. On nuScenes and Argoverse 2, MapGR consistently improves mAP over strong baselines and achieves state-of-the-art results on nuScenes with minimal computational overhead, making it practical for online HD map construction.
Abstract
The online construction of vectorized high-definition (HD) maps is a cornerstone of modern autonomous driving systems. State-of-the-art approaches, particularly those based on the DETR framework, formulate this as an instance detection problem. However, their reliance on independent, learnable object queries results in a predominantly local query perspective that neglects the global representation inherent in HD maps. In this work, we propose \textbf{MapGR} (\textbf{G}lobal \textbf{R}epresentation learning for HD \textbf{Map} construction), an architecture designed to learn and utilize global representations from queries. Our method introduces two synergistic modules: a Global Representation Learning (GRL) module, which encourages the distribution of all queries to better align with the global map through a carefully designed holistic segmentation task, and a Global Representation Guidance (GRG) module, which endows each individual query with explicit, global-level contextual information to facilitate its optimization. Evaluations on the nuScenes and Argoverse 2 datasets validate the efficacy of our approach, demonstrating substantial improvements in mean Average Precision (mAP) compared to leading baselines.
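
The two-module idea can be illustrated with a minimal sketch. The abstract does not specify how the global representation is pooled or fused, so the sketch assumes mean pooling over all queries (standing in for GRL) and additive fusion (standing in for GRG). The function names `pool_global_embedding` and `guide_queries` and the `weight` parameter are hypothetical, and the segmentation head that would decode the global embedding into a BEV map for supervision is omitted.

```python
from typing import List

Vector = List[float]


def pool_global_embedding(queries: List[Vector]) -> Vector:
    """Summarize all queries as one global vector.

    Assumption: plain mean pooling; the paper's GRL module may differ.
    """
    if not queries:
        raise ValueError("need at least one query")
    count = len(queries)
    dim = len(queries[0])
    return [sum(q[d] for q in queries) / count for d in range(dim)]


def guide_queries(
    queries: List[Vector], global_emb: Vector, weight: float = 1.0
) -> List[Vector]:
    """Give every query the same global context.

    Assumption: weighted addition; the paper's GRG module may differ.
    """
    return [
        [q_d + weight * g_d for q_d, g_d in zip(q, global_emb)]
        for q in queries
    ]


# Three toy 2-D queries standing in for map-element queries.
queries = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
global_emb = pool_global_embedding(queries)
guided = guide_queries(queries, global_emb, weight=0.5)
```

In this sketch each guided query keeps its own instance-level content while sharing one map-level summary, which mirrors the abstract's description of combining local and global information.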
