Complementing Onboard Sensors with Satellite Map: A New Perspective for HD Map Construction
Wenjie Gao, Jiawei Fu, Yanqing Shen, Haodong Jing, Shitao Chen, Nanning Zheng
TL;DR
This paper addresses two limitations of onboard sensors in HD map construction, limited long-range perception and occlusion, by introducing satellite maps as a complementary data source. It presents a hierarchical fusion framework that combines feature-level masked cross-attention with bird's-eye-view (BEV)-level alignment to integrate satellite tiles with the BEV features derived from onboard sensors, improving both HD map semantic segmentation and instance detection. A complementary satellite map dataset for nuScenes is released to facilitate research. When added to three baseline methods, the proposed fusion improves performance on both tasks, especially in long-range scenarios, underscoring the practical value of cloud-based auxiliary maps for autonomous driving.
Abstract
High-definition (HD) maps play a crucial role in autonomous driving systems. Recent methods have attempted to construct HD maps in real time using vehicle onboard sensors. Because of the inherent limitations of onboard sensors, which include sensitivity to detection range and susceptibility to occlusion by nearby vehicles, the performance of these methods declines significantly in complex scenarios and long-range detection tasks. In this paper, we explore a new perspective that boosts HD map construction by using satellite maps to complement onboard sensors. We first generate satellite map tiles for each sample in nuScenes and release a complementary dataset for further research. To better integrate satellite maps with existing methods, we propose a hierarchical fusion module, which includes feature-level fusion and BEV-level fusion. The feature-level fusion, composed of a mask generator and a masked cross-attention mechanism, refines the features from onboard sensors. The BEV-level fusion uses an alignment module to mitigate the coordinate differences between features obtained from onboard sensors and those obtained from satellite maps. Experimental results on the augmented nuScenes dataset show that our module integrates seamlessly into three existing HD map construction methods. Together, the satellite maps and the proposed module notably improve their performance in both HD map semantic segmentation and instance detection.
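
To make the feature-level fusion idea concrete, the following is a minimal sketch of masked cross-attention in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `masked_cross_attention`, the single-head form, the residual update, and the hand-written mask are all hypothetical. In the paper the mask comes from a learned mask generator, and the real module operates on learned feature maps rather than small lists.

```python
import math


def masked_cross_attention(queries, keys, values, mask):
    """Single-head cross-attention restricted by a boolean mask (illustrative).

    queries: Nq feature vectors from onboard sensors (assumed layout)
    keys, values: Nk feature vectors from the satellite tile (assumed layout)
    mask[i][j]: True if query i may attend to satellite position j
    Returns the queries plus the attended satellite features (residual form,
    an assumption made for this sketch).
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    refined = []
    for q, allowed in zip(queries, mask):
        # Scaled dot-product scores, computed only for unmasked positions.
        scores = {
            j: scale * sum(a * b for a, b in zip(q, keys[j]))
            for j, ok in enumerate(allowed)
            if ok
        }
        if not scores:
            # Nothing selected by the mask: leave the onboard feature as is.
            refined.append(list(q))
            continue
        # Numerically stable softmax over the unmasked positions.
        top = max(scores.values())
        exp = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exp.values())
        update = [
            sum(exp[j] / total * values[j][c] for j in exp) for c in range(dim)
        ]
        refined.append([a + u for a, u in zip(q, update)])
    return refined


# Toy example: two onboard features, three satellite features.
bev = [[1.0, 0.0], [0.0, 1.0]]
sat = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = [[True, False, False], [False, False, False]]
out = masked_cross_attention(bev, sat, sat, mask)
```

In this toy example the first onboard feature attends only to the first satellite feature, while the second is fully masked and passes through unchanged. The mask is what lets the fusion ignore satellite regions judged unhelpful, for instance areas where the overhead view is itself occluded.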
