Monocular Localization with Semantics Map for Autonomous Vehicles

Jixiang Wan; Xudong Zhang; Shuzhou Dong; Yuwei Zhang; Yuchen Yang; Ruoxi Wu; Ye Jiang; Jijunnan Li; Jinquan Lin; Ming Yang

TL;DR

This work tackles robust monocular localization for autonomous driving by leveraging stable semantic cues instead of fragile texture features. It proposes a lightweight two-stage pipeline: offline construction of a global semantic map from LiDAR data, and online monocular localization via semantic feature data association, aided by an enhanced IPM (inverse perspective mapping) that compensates for vehicle-induced orientation changes. The optimization fuses lane markings and pole-like objects with a prior pose using nonlinear least squares, achieving competitive accuracy with a much smaller map than dense SLAM baselines. Evaluations on the KAIST Urban dataset and a self-recorded industrial-park dataset demonstrate strong translation and rotation performance and practical real-time operation, highlighting the method's potential for scalable, low-cost autonomous driving localization. The work advances semantic-map-based localization by combining lightweight segmentation, BEV mapping, and robust feature matching with a global optimization framework.
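To make the IPM step more concrete, here is a minimal sketch of a ground-plane back-projection with roll, pitch, and yaw compensation. It is not the paper's enhanced IPM formulation, which this summary does not spell out. It assumes a pinhole camera, a flat ground plane, a vehicle frame with X forward, Y left, and Z up, and a ZYX Euler convention in which positive pitch tilts the camera downward. The function names and parameters (`rotation_zyx`, `pixel_to_ground`, `height`) are hypothetical.

```python
import math

def rotation_zyx(roll, pitch, yaw):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cw, sw = math.cos(yaw), math.sin(yaw)
    return [
        [cw * cp, cw * sp * sr - sw * cr, cw * sp * cr + sw * sr],
        [sw * cp, sw * sp * sr + cw * cr, sw * sp * cr - cw * sr],
        [-sp, cp * sr, cp * cr],
    ]

def pixel_to_ground(u, v, fx, fy, cx, cy, height,
                    roll=0.0, pitch=0.0, yaw=0.0):
    """Project pixel (u, v) onto the ground plane Z = 0.

    The camera sits at (0, 0, height) in the vehicle frame (assumed
    X forward, Y left, Z up). Returns (X, Y) in metres, or None when
    the viewing ray does not hit the ground (at or above the horizon).
    """
    # Viewing ray for a level camera: image right maps to -Y, image down to -Z.
    ray = [1.0, -(u - cx) / fx, -(v - cy) / fy]
    # Compensate for the camera orientation relative to the ground.
    R = rotation_zyx(roll, pitch, yaw)
    d = [sum(R[i][k] * ray[k] for k in range(3)) for i in range(3)]
    if d[2] >= 0.0:
        return None  # ray points upward or parallel to the ground
    t = height / -d[2]  # scale at which the ray reaches Z = 0
    return (t * d[0], t * d[1])
```

With a level camera mounted 1.5 m above the road, a pixel one focal length below the principal point lands 1.5 m ahead of the camera. Pitching the camera down by 45 degrees maps the principal point itself to that same ground location.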

Abstract

Accurate and robust localization remains a significant challenge for autonomous vehicles. The cost of sensors and limitations in local computational efficiency make it difficult to scale to large commercial applications. Traditional vision-based approaches focus on texture features that are susceptible to changes in lighting, season, perspective, and appearance. Additionally, the large storage size of maps with descriptors and complex optimization processes hinder system performance. To balance efficiency and accuracy, we propose a novel lightweight visual semantic localization algorithm that employs stable semantic features instead of low-level texture features. First, semantic maps are constructed offline by detecting semantic objects, such as ground markers, lane lines, and poles, using cameras or LiDAR sensors. Then, online visual localization is performed through data association of semantic features and map objects. We evaluated our proposed localization framework on the publicly available KAIST Urban dataset and in scenarios recorded by ourselves. The experimental results demonstrate that our method is a reliable and practical localization solution for various autonomous driving localization tasks.
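The data association step mentioned in the abstract can be illustrated with a simple gated nearest-neighbor match between observed features and map objects. This is a generic sketch, not the association scheme used in the paper. It assumes 2D point landmarks (for example pole positions), a prior pose given as `(x, y, yaw)`, and a hypothetical gating distance `max_dist`; all function names are illustrative.

```python
import math

def transform_to_map(pose, point):
    """Transform a 2D point from the vehicle frame into the map frame."""
    x, y, yaw = pose
    px, py = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * px - s * py, y + s * px + c * py)

def associate(pose, observations, map_landmarks, max_dist=1.0):
    """Match each observation to its nearest map landmark.

    Returns a list of (observation_index, landmark_index) pairs.
    Observations with no landmark closer than max_dist metres are
    left unmatched.
    """
    matches = []
    for i, obs in enumerate(observations):
        mx, my = transform_to_map(pose, obs)
        best_j, best_d = None, max_dist
        for j, (lx, ly) in enumerate(map_landmarks):
            d = math.hypot(lx - mx, ly - my)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j))
    return matches

# Example: vehicle at (10, 5) facing +Y, one pole 2 m ahead, one far away.
pose = (10.0, 5.0, math.pi / 2)
observations = [(2.0, 0.0), (30.0, 0.0)]
landmarks = [(0.0, 0.0), (10.2, 7.1), (50.0, 50.0)]
matches = associate(pose, observations, landmarks)
```

In this example only the first observation falls within the gate, so it is matched to the second landmark and the distant observation is dropped. The matched pairs would then supply the residuals for a pose optimization.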

Paper Structure (13 sections, 14 equations, 8 figures, 3 tables)

INTRODUCTION
RELATED WORKS
Visual Localization
LiDAR SLAM
PROPOSED APPROACH
Semantic Map
Image Segmentation
Inverse Perspective Transformation
Optimization Solver
EXPERIMENTAL EVALUATION
Datasets
Visual localization accuracy
CONCLUSIONS

Figures (8)

Figure 1: Illustration of the system structure. The upper part illustrates the construction of the global semantic map, and the lower part shows the vehicle localization process using the monocular camera.
Figure 2: Point cloud map generation and BEV segmentation. (a) shows the original point cloud map. (b) is the ground point cloud produced by LiDAR SLAM. (c) provides an example of a BEV image, where each pixel corresponds to a 10 cm voxel. (d) displays the Otsu binarization results, which preserve high-contrast features on roads, including lane lines and markers.
Figure 3: Image segmentation. (a) is the raw image captured by the front-view camera. (b) is the semantic segmentation result. The orange and gray pixels indicate ground markers and poles, respectively. Green pixels highlight the outline of the ground markers and red pixels indicate the fitted straight lines of the poles. Note that short poles are discarded to avoid introducing noise.
Figure 4: Schematic of the basic IPM model.
Figure 5: Schematic of the enhanced IPM model with roll, pitch, and yaw angle compensation.
...and 3 more figures
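
The Otsu binarization mentioned in the Figure 2 caption can be sketched as follows. This is a textbook implementation of Otsu's method on an 8-bit intensity image stored as nested lists, not the authors' code; the function names `otsu_threshold` and `binarize` are hypothetical, and the tiny example image stands in for a BEV intensity image.

```python
def otsu_threshold(image):
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w_bg = 0          # number of pixels at or below the candidate threshold
    sum_bg = 0        # intensity sum of those pixels
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def binarize(image, threshold):
    """Mark pixels brighter than the threshold as 1, others as 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]

# Dark asphalt (20 to 30) next to bright paint (200 to 220).
bev = [[20, 25, 200],
       [30, 210, 220]]
t = otsu_threshold(bev)
mask = binarize(bev, t)
```

On this example the threshold falls between the dark road surface and the bright markings, so only the three bright pixels survive in `mask`. This mirrors how high-contrast lane lines and markers are preserved in the binarized BEV image.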
