InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping

Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao

TL;DR

This work investigates the utilization of inner-instance information for vectorized high-definition mapping through transformers, and proposes a powerful system named InsMapper, which effectively harnesses inner-instance information with three exquisite designs, including hybrid query generation, inner-instance query fusion, and inner-instance feature aggregation.

Abstract

Vectorized high-definition (HD) maps contain detailed information about surrounding road elements, which are crucial for various downstream tasks in modern autonomous vehicles, such as motion planning and vehicle control. Recent works attempt to directly detect the vectorized HD map as a point set prediction task, achieving notable detection performance improvements. However, these methods usually overlook and fail to analyze the important inner-instance correlations between predicted points, impeding further advancements. To address this issue, we investigate the utilization of inner-instance information for vectorized high-definition mapping through transformers, and propose a powerful system named InsMapper, which effectively harnesses inner-instance information with three exquisite designs, including hybrid query generation, inner-instance query fusion, and inner-instance feature aggregation. The first two modules can better initialize queries for line detection, while the last one refines predicted line instances. InsMapper is highly adaptable and can be seamlessly modified to align with the most recent HD map detection frameworks. Extensive experimental evaluations are conducted on the challenging NuScenes and Argoverse 2 datasets, where InsMapper surpasses the previous state-of-the-art method, demonstrating its effectiveness and generality. The project page for this work is available at https://tonyxuqaq.github.io/InsMapper/ .

Paper Structure (15 sections, 3 equations, 10 figures, 8 tables)


Figures (10)

  • Figure 1: Comparison of vectorized HD map detection methods. All solutions are evaluated on the NuScenes validation set. The x-axis shows mAP and the y-axis shows topology-level correctness as defined in [he2018roadrunner]. InsMapper is flexible and can be adapted to multiple existing frameworks. InsMapper is built on MapTR [liao2022maptr]. $\dagger$ denotes InsMapper built on the more recent PivotNet framework [ding2023pivotnet], while $\ddagger$ denotes InsMapper built on the previous SOTA, MapTR-V2 [liao2023maptrv2].
  • Figure 2: Pre-processing of the vector map. Pink lines represent edges, orange points indicate vertices, and the blue point is the intersection vertex with a degree greater than two. The intersection is removed to simplify the graph, and each obtained instance is then evenly re-sampled into $n_p$ vertices ($n_p=4$ in this example).
  • Figure 3: Visualization of inner-instance correlations. Green lines represent the inner-instance correlation between the blue point of an instance and other points within the same instance.
  • Figure 4: Overall framework. InsMapper is an end-to-end transformer model with an encoder-decoder structure. The transformer encoder projects perspective-view camera images into a bird's-eye view (BEV). Subsequently, the transformer decoder detects vector instances by predicting point sets. To enhance the utilization of inner-instance information, we introduce the following three components: a hybrid query generation scheme (orange module), an inner-instance query fusion module (yellow module), and an inner-instance feature aggregation module (blue module). The first two modules better initialize queries for detection and the final one refines detected line instances.
  • Figure 5: Query generation schemes. For concise visualization, the number of instances $N_I$ is 2, and the number of points per instance $n_p$ is 3. The hybrid scheme can initialize lines with both good diversity and quality.
  • ...and 5 more figures
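
The even re-sampling step described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `resample_polyline`, the 2D tuple representation, and the choice of linear interpolation along arc length are all hypothetical. The caption only states that each instance is evenly re-sampled into $n_p$ vertices.

```python
import math


def resample_polyline(points, n_p):
    """Re-sample a 2D polyline into n_p vertices spaced evenly by arc length.

    points: list of (x, y) tuples ordered along the line (hypothetical format).
    n_p:    number of output vertices, including both endpoints.
    """
    if n_p < 2 or len(points) < 2:
        raise ValueError("need at least 2 input points and n_p >= 2")

    # Cumulative arc length at every input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    out = []
    seg = 0  # index of the input segment that contains the current target
    for k in range(n_p):
        t = total * k / (n_p - 1)  # target arc length of the k-th output vertex
        # Advance to the segment whose end is at or beyond the target.
        while seg < len(points) - 2 and cum[seg + 1] < t:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        r = 0.0 if seg_len == 0 else (t - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + r * (x1 - x0), y0 + r * (y1 - y0)))
    return out


if __name__ == "__main__":
    # An L-shaped line of total length 6, re-sampled with n_p = 4 as in Figure 2.
    print(resample_polyline([(0.0, 0.0), (3.0, 0.0), (3.0, 3.0)], 4))
```

Because every instance ends up with the same number of vertices, a set of instances can be stacked into a fixed-size array of shape (number of instances, $n_p$, 2), which is convenient for point set prediction.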