InstaGraM: Instance-level Graph Modeling for Vectorized HD Map Learning

Juyeb Shin, Hyeonjun Jeong, Francois Rameau, Dongsuk Kum

TL;DR

This work targets online HD map construction for GPS-independent, map-based localization. It introduces InstaGraM, an end-to-end framework that represents vectorized map elements as an instance-level graph and learns their connectivity with a graph neural network. It combines BEV-based feature extraction, vertex/edge detectors, and an attentional GNN with differentiable optimal matching to produce accurate vectorized maps in real time. On nuScenes, InstaGraM improves over the state-of-the-art baseline by 1.6 mAP, and by 4.8 mAP in the long-range configuration, aided by distance-transform embeddings that provide geometric priors. The approach reduces reliance on post-processing and autoregressive decoding, and the code is publicly available.

Abstract

For scalable autonomous driving, a robust map-based localization system that is independent of GPS is fundamental. To achieve such map-based localization, online high-definition (HD) map construction plays a significant role in accurate pose estimation. Although recent work on online HD map construction has predominantly focused on vectorized representations due to their effectiveness, these methods suffer from high computational cost and fixed parametric models, which limit scalability. To alleviate these limitations, we propose a novel HD map learning framework that leverages graph modeling. This framework is designed to learn the construction of diverse geometric shapes, thereby enhancing the scalability of HD map construction. Our approach represents the map elements as an instance-level graph by decomposing them into vertices and edges, which facilitates accurate and efficient end-to-end vectorized HD map learning. Furthermore, we introduce an association strategy using a Graph Neural Network to efficiently handle the complex geometry of various map elements while maintaining scalability. Comprehensive experiments on a public open dataset show that our proposed network outperforms the state-of-the-art model by $1.6$ mAP. We further showcase the superior scalability of our approach compared to state-of-the-art methods, achieving a $4.8$ mAP improvement in the long-range configuration. Our code is available at https://github.com/juyebshin/InstaGraM.
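
To make the decomposition into vertices and edges concrete, the following is a minimal sketch of how a set of polyline map elements could be flattened into one instance-level graph. It is not taken from the InstaGraM code base: the function name `polylines_to_graph`, the dense adjacency-matrix layout, and the toy coordinates are assumptions made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polylines_to_graph(polylines: List[List[Point]]):
    """Flatten polylines into vertices plus forward/backward adjacency.

    Hypothetical helper: each polyline is one map-element instance, and
    consecutive points of the same polyline are joined by a directed edge.
    """
    vertices: List[Point] = []
    instance_ids: List[int] = []
    for inst, line in enumerate(polylines):
        for point in line:
            vertices.append(point)
            instance_ids.append(inst)

    n = len(vertices)
    # forward[i][j] == 1 means vertex j directly follows vertex i.
    forward = [[0] * n for _ in range(n)]
    offset = 0
    for line in polylines:
        for k in range(len(line) - 1):
            forward[offset + k][offset + k + 1] = 1
        offset += len(line)

    # The backward edges are the transpose of the forward edges.
    backward = [list(row) for row in zip(*forward)]
    return vertices, instance_ids, forward, backward


# Toy example: a three-point lane divider and a two-point crossing edge.
lane = [(0.0, 0.0), (0.0, 5.0), (0.0, 10.0)]
crossing = [(2.0, 0.0), (4.0, 0.0)]
vertices, instance_ids, forward, backward = polylines_to_graph([lane, crossing])
```

In this layout no edge connects vertices of different instances, so each connected chain in the adjacency matrix corresponds to one map element.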

Paper Structure (16 sections, 9 equations, 7 figures, 6 tables)

Figures (7)

  • Figure 1: We model vectorized map elements as a graph, composed of vertices and edges. (a) is an HD map sample taken from nuScenes. (b) illustrates the vertices of map elements and their bidirectional edges, depicted as forward and backward.
  • Figure 2: We propose InstaGraM, a hybrid architecture of CNNs and a GNN for real-time HD map learning in bird's-eye-view representation. Starting from the input surround images and camera parameters, a unified BEV representation is extracted by projecting and fusing image features. InstaGraM extracts vertex locations and implicit edge maps of map elements, and the final vectorized HD map elements are generated through a GNN.
  • Figure 3: Proposed InstaGraM architecture. The blocks at the top show the overall components of the InstaGraM architecture, and the bottom blocks show the details of the structure and training of each component.
  • Figure 4: Illustration of graph embedding extraction. Vertex position (top) provides geometric information of vertices, whereas distance transform map (bottom) supports local connectivity between vertices as well as the spatial structure of map elements via weights distributed along the normal direction of connection centered at map elements.
  • Figure 5: Qualitative comparison on complex traffic scenes under various conditions. From the top row: sunny, partly cloudy, cloudy, and night conditions.
  • ...and 2 more figures
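
The distance transform map mentioned in the Figure 4 caption can be illustrated with a small brute-force sketch. It assumes a binary raster in which nonzero cells lie on a map element, and it stores for every cell the Euclidean distance to the nearest such cell, clipped at a maximum value. The function name `distance_transform`, the `max_dist` clipping, and the toy grid are assumptions for illustration; the exact formulation and normalization used in the paper are not given in this section.

```python
import math
from typing import List


def distance_transform(mask: List[List[int]],
                       max_dist: float = 10.0) -> List[List[float]]:
    """Distance from each cell to the nearest nonzero cell, clipped at max_dist.

    Brute force, O(cells * occupied cells); fine for a toy grid only.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    occupied = [(r, c) for r in range(height) for c in range(width) if mask[r][c]]

    # Start from the clipping value so empty masks yield max_dist everywhere.
    out = [[max_dist] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for pr, pc in occupied:
                d = math.hypot(r - pr, c - pc)
                if d < out[r][c]:
                    out[r][c] = d
    return out


# Toy raster: a vertical map element running down column 2 of a 3x5 grid.
mask = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
dt = distance_transform(mask)
dt_clipped = distance_transform(mask, max_dist=1.5)
```

Values grow with distance along the normal of the element, so cells on either side of the line carry a smooth cue about where the element lies, which is the kind of local structure the caption describes.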