SemVecNet: Generalizable Vector Map Generation for Arbitrary Sensor Configurations
Narayanan Elavathur Ranganatha, Hengyuan Zhang, Shashank Venkatramani, Jing-Yan Liao, Henrik I. Christensen
TL;DR
SemVecNet tackles the generalization problem in online vector map generation by introducing a modular pipeline that uses a BEV semantic map as an intermediate representation, decoupling sensor configurations from the final vector map. The semantic mapping stage fuses real-time camera and LiDAR data into an ego-centric BEV semantic grid, which is then vectorized by a MapTRv2-inspired decoder to produce labeled map elements. Cross-dataset experiments demonstrate substantially better transfer than state-of-the-art end-to-end approaches, and real-world campus data validates practical applicability without retraining. The approach reduces the need for extensive labeling and retraining when deploying across platforms with different sensor setups, moving toward sensor-configuration-agnostic autonomous systems.
Abstract
Vector maps are essential in autonomous driving for tasks like localization and planning, yet their creation and maintenance are notably costly. While recent advances in online vector map generation for autonomous vehicles are promising, current models lack adaptability to different sensor configurations. They tend to overfit to specific sensor poses, leading to decreased performance and higher retraining costs. This limitation hampers their practical use in real-world applications. In response to this challenge, we propose a modular pipeline for vector map generation with improved generalization to sensor configurations. The pipeline leverages probabilistic semantic mapping to generate a bird's-eye-view (BEV) semantic map as an intermediate representation. This intermediate representation is then converted to a vector map using the MapTRv2 decoder. By adopting a BEV semantic map that is robust to different sensor configurations, our proposed approach significantly improves generalization performance. We evaluate the model on datasets with sensor configurations not used during training. Our evaluation sets include larger public datasets and smaller-scale private data collected on our platform. Our model generalizes significantly better than state-of-the-art methods.
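To make the idea of a probabilistic BEV semantic map more concrete, the following is a minimal sketch of a per-cell Bayesian label update. It is an illustration under stated assumptions and not the authors' implementation: the function names (`make_grid`, `update_grid`), the confusion-matrix sensor model, the grid layout, and the assumption that points already carry a semantic label and are expressed in the ego frame are all hypothetical.

```python
from typing import List, Sequence, Tuple

# grid[row][col][cls] holds the probability that a BEV cell belongs to class cls.
Grid = List[List[List[float]]]


def make_grid(rows: int, cols: int, num_classes: int) -> Grid:
    """Create a BEV grid with a uniform class prior in every cell."""
    p = 1.0 / num_classes
    return [[[p] * num_classes for _ in range(cols)] for _ in range(rows)]


def update_grid(
    grid: Grid,
    points: Sequence[Tuple[float, float, int]],
    confusion: Sequence[Sequence[float]],
    cell_size: float,
    x_min: float,
    y_min: float,
) -> None:
    """Fuse labeled ego-frame points (x, y, observed_label) into the grid in place.

    confusion[obs][true] is an assumed sensor model: the probability of
    observing label `obs` when the true class of the cell is `true`.
    """
    rows, cols = len(grid), len(grid[0])
    for x, y, obs in points:
        # Skip points behind the grid origin before discretizing.
        if x < x_min or y < y_min:
            continue
        row = int((x - x_min) / cell_size)
        col = int((y - y_min) / cell_size)
        # Skip points that fall beyond the far edge of the grid.
        if row >= rows or col >= cols:
            continue
        prior = grid[row][col]
        # Bayes rule: posterior is proportional to likelihood times prior.
        unnorm = [confusion[obs][c] * prior[c] for c in range(len(prior))]
        total = sum(unnorm)
        if total > 0.0:
            grid[row][col] = [u / total for u in unnorm]


if __name__ == "__main__":
    # Two hypothetical classes: 0 = road, 1 = lane marking.
    confusion = [[0.9, 0.2], [0.1, 0.8]]
    grid = make_grid(rows=2, cols=2, num_classes=2)
    observations = [(0.5, 0.5, 0), (0.6, 0.4, 0), (5.0, 5.0, 1)]
    update_grid(grid, observations, confusion, cell_size=1.0, x_min=-1.0, y_min=-1.0)
    print(grid[1][1])
```

Because the grid is defined in the ego frame rather than in any sensor frame, observations from differently mounted cameras or LiDARs would update the same cells. This is the property the abstract relies on when it describes the BEV semantic map as an intermediate representation that decouples sensor configuration from vectorization.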
