Enhancing Online Road Network Perception and Reasoning with Standard Definition Maps

Hengyuan Zhang, David Paz, Yuliang Guo, Arun Das, Xinyu Huang, Karsten Haug, Henrik I. Christensen, Liu Ren

TL;DR

The paper tackles online HD map perception for autonomous driving under occlusion and at the long ranges needed for planning by introducing Standard Definition (SD) maps as lightweight priors. It presents two integration pathways: rasterized SD maps fused into bird's-eye-view (BEV) features for perception, and graph-based SD maps coupled with a BEV–SGNN pipeline for joint perception and reasoning. Experiments on OpenLane-V2 and OpenLane-V2-OSM demonstrate faster convergence and up to a 30% improvement in centerline mAP, with graph-based SD maps offering parameter-efficient gains. The work contributes a public SD-map–augmented dataset and shows that SD priors are robust to moderate localization noise, addressing the scalability and maintenance challenges of HD maps in real-world online mapping. The perception range used in the evaluations is extended to $[-50\,\mathrm{m}, 50\,\mathrm{m}]$ from the typical $[-30\,\mathrm{m}, 30\,\mathrm{m}]$, highlighting the method's effectiveness in longer-range planning scenarios. The study also shows that SD-map graphs can reduce model size while preserving or improving accuracy, suggesting practical benefits for deployment in resource-constrained settings.
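
To make the rasterized pathway concrete, the following is a minimal sketch of how SD-map polylines might be drawn into a BEV grid covering the extended 50 m range and then attached to image BEV features. It is not the authors' implementation: the function names, the 0.5 m cell size, and the use of plain channel concatenation in place of a learned SD-map encoder and fusion layer are all assumptions made for illustration.

```python
import math


def rasterize_sd_map(polylines, half_range_m=50.0, resolution_m=0.5):
    """Draw SD-map polylines (ego-frame metres) into a binary BEV grid.

    half_range_m=50.0 mirrors the extended evaluation range; the 0.5 m
    resolution is an assumed value, not one taken from the paper.
    """
    size = int(round(2 * half_range_m / resolution_m))
    grid = [[0] * size for _ in range(size)]

    def mark(x, y):
        # Map metric coordinates to grid indices; points outside are dropped.
        col = math.floor((x + half_range_m) / resolution_m)
        row = math.floor((y + half_range_m) / resolution_m)
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = 1

    for line in polylines:
        for (x0, y0), (x1, y1) in zip(line, line[1:]):
            # Sample each segment at half-cell spacing so no cell is skipped.
            length = math.hypot(x1 - x0, y1 - y0)
            steps = max(1, math.ceil(length / (0.5 * resolution_m)))
            for k in range(steps + 1):
                t = k / steps
                mark(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return grid


def fuse_with_bev(bev_features, sd_raster):
    """Append the SD raster as one extra channel per BEV cell.

    Concatenation is a stand-in (assumption) for a learned fusion module.
    """
    return [
        [cell + [float(sd)] for cell, sd in zip(feat_row, sd_row)]
        for feat_row, sd_row in zip(bev_features, sd_raster)
    ]


# A straight road along the x-axis that extends beyond the perception range.
road = [[(-60.0, 0.0), (60.0, 0.0)]]
raster = rasterize_sd_map(road)
bev = [[[0.0, 0.0] for _ in range(len(raster))] for _ in range(len(raster))]
fused = fuse_with_bev(bev, raster)
```

In this sketch the road marks one full row of the 200 × 200 grid, and each fused cell carries the two placeholder image channels plus one SD-map channel.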

Abstract

Autonomous driving for urban and highway driving applications often requires High Definition (HD) maps to generate a navigation plan. Nevertheless, various challenges arise when generating and maintaining HD maps at scale. While recent online mapping methods have started to emerge, their performance, especially at longer ranges, is limited by heavy occlusion in dynamic environments. With these considerations in mind, our work focuses on leveraging lightweight and scalable priors, Standard Definition (SD) maps, in the development of online vectorized HD map representations. We first examine the integration of prototypical rasterized SD map representations into various online mapping architectures. Furthermore, to identify lightweight strategies, we extend the OpenLane-V2 dataset with OpenStreetMap data and evaluate the benefits of graphical SD map representations. A key finding from designing SD map integration components is that SD map encoders are model agnostic and can be quickly adapted to new architectures that utilize bird's-eye-view (BEV) encoders. Our results show that using SD maps as priors for the online mapping task can significantly speed up convergence and boost the performance of the online centerline perception task by 30% (mAP). Furthermore, we show that introducing SD maps, specifically SD map graphs, reduces the number of parameters in the perception and reasoning task while improving overall performance. Project Page: https://henryzhangzhy.github.io/sdhdmap/.

Paper Structure

This paper contains 11 sections, 4 equations, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Online road network perception and reasoning is challenging due to occlusion by on-road objects, especially at the long ranges required for planning. In this example, the left-turn map elements are heavily occluded by vehicles. The baseline (TopoNet), which uses only image data, misses the left turn, while our method (TopoNet+OSMR, which uses rasterized Standard Definition (SD) maps as a prior) predicts it correctly. Visualizations show centerlines with connectivity information.
  • Figure 2: Our pipeline integrating a rasterized SD map with the state-of-the-art online perception approach MapTR. The model encodes the SD map as rasterized features in bird's-eye-view (BEV) space, fuses them with image BEV features, and predicts centerlines with a deformable attention decoder.
  • Figure 3: Our pipeline integrating graph-based SD maps with a state-of-the-art perception and reasoning architecture based on TopoNet, using a BEV-SD OSM graph encoder. The method processes multi-view image data and OSM SD map graphs, and uses deformable decoders together with a Scene Graph Neural Network to predict centerlines, traffic elements, and their relationships.
  • Figure 4: Visual comparison of the ground-truth online maps, OSM SD maps, and OpenLane-V2 (OLV2) SD maps. The OSM SD maps appear more consistent with the ground truth.
  • Figure 5: Evaluation mAP during training with Chamfer distance. The model with an SD map converges much faster and achieves better performance.
  • ...and 1 more figure
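
Figure 5 reports mAP computed with Chamfer distance. As a rough illustration of that metric's building block, the sketch below computes a symmetric Chamfer distance between a predicted and a ground-truth centerline, each given as a list of 2D points, and applies a distance threshold to decide whether the prediction counts as a match. The averaging convention, the function names, and the threshold value are assumptions for illustration; the benchmark's official evaluation code may differ.

```python
import math


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance between two 2D point lists (metres).

    Each direction averages nearest-neighbour distances; the two directions
    are then averaged. This convention is an assumption of the sketch.
    """
    def one_way(a, b):
        return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

    return 0.5 * (one_way(pred, gt) + one_way(gt, pred))


def is_match(pred, gt, threshold_m):
    """A prediction matches if its Chamfer distance is within the threshold."""
    return chamfer_distance(pred, gt) <= threshold_m


# Two parallel centerlines offset laterally by 1 m.
pred_line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
gt_line = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
distance = chamfer_distance(pred_line, gt_line)

# threshold_m values below are hypothetical, chosen only to show both outcomes.
loose = is_match(pred_line, gt_line, threshold_m=1.5)
strict = is_match(pred_line, gt_line, threshold_m=0.5)
```

Here every point's nearest neighbour is exactly 1 m away, so the distance is 1 m: the pair matches under the looser threshold and fails under the stricter one.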