VOLoc: Visual Place Recognition by Querying Compressed Lidar Map

Xudong Cai, Yongcai Wang, Zhe Huang, Yu Shao, Deying Li

TL;DR

VOLoc tackles image-to-Lidar place recognition directly in city-scale compressed Lidar maps by leveraging geometric similarity. It introduces a Geometry-Preserving Compressor (GPC) to build a reversible, compact map database and a Geometric Recovery Module (GRM) that reconstructs a Querying Point Cloud (QPC) from monocular image sequences via Visual Odometry, followed by a shared attention-based descriptor aggregator. A transfer-learning scheme pre-trains the aggregation network on large Lidar data and fine-tunes on QPCs, yielding robust cross-modal descriptors. Experimental results on KITTI-derived data show VOLoc achieves localization accuracy competitive with Lidar-to-Lidar methods while using smaller query and map footprints, enabling memory-efficient, mobile-friendly localization. The work also provides a KITTI-based Visual-to-Lidar dataset to facilitate future research in cross-modal, compressed-map VPR.

Abstract

The availability of city-scale Lidar maps enables city-scale place recognition using mobile cameras. However, city-scale Lidar maps generally need to be compressed for storage efficiency, which makes direct visual place recognition in compressed Lidar maps more difficult. This paper proposes VOLoc, an accurate and efficient visual place recognition method that exploits geometric similarity to directly query the compressed Lidar map with an image sequence captured in real time. In the offline phase, VOLoc compresses the Lidar maps using a Geometry-Preserving Compressor (GPC), whose compression is reversible, a crucial requirement for downstream 6DoF pose estimation. In the online phase, VOLoc uses a Geometric Recovery Module (GRM), composed of online Visual Odometry (VO) and a point cloud optimization module, to recover the local scene structure around the camera and build the Querying Point Cloud (QPC). The QPC is then compressed by the same GPC and aggregated into a global descriptor by an attention-based aggregation module, which is used to query the compressed Lidar map in the vector space. A transfer learning mechanism is also proposed to improve the accuracy and generality of the aggregation network. Extensive evaluations show that VOLoc provides localization accuracy even better than Lidar-to-Lidar place recognition, setting a new record for utilizing compressed Lidar maps with low-end mobile cameras. The code is publicly available at https://github.com/Master-cai/VOLoc.
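
The final step of the abstract, querying the map "in the vector space", can be illustrated with a minimal retrieval sketch. It is not taken from the VOLoc code: the function names (`l2_normalize`, `query_top_k`), the toy 4-D descriptors, and the choice of Euclidean distance on L2-normalized vectors are all assumptions made for illustration. It only shows how a global descriptor of the query could be matched against precomputed descriptors of map submaps.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a descriptor to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def query_top_k(query: Sequence[float],
                database: Sequence[Sequence[float]],
                k: int = 1) -> List[Tuple[int, float]]:
    """Return the k nearest database entries as (index, distance) pairs.

    Assumption: similarity is Euclidean distance between L2-normalized
    descriptors; the source only says the query happens in the vector space.
    """
    q = l2_normalize(query)
    scored = []
    for idx, entry in enumerate(database):
        d = l2_normalize(entry)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, d)))
        scored.append((idx, dist))
    # Smallest distance first; slicing caps the result at the database size.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical descriptors: one per compressed map submap (toy 4-D values).
map_descriptors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Hypothetical descriptor aggregated from a Querying Point Cloud.
query_descriptor = [0.1, 0.9, 0.0, 0.1]

top = query_top_k(query_descriptor, map_descriptors, k=2)
print([idx for idx, _ in top])
```

A brute-force scan like this is only practical for small databases; a city-scale map would more plausibly use an approximate nearest-neighbor index, which the source does not specify.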

Paper Structure (28 sections, 3 equations, 7 figures, 2 tables)

This paper contains 28 sections, 3 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Location images in compressed Lidar maps
  • Figure 2: Overall framework of VOLoc
  • Figure 3: Architecture for Global Feature Aggregation
  • Figure 4: Average recall@K on the KITTI dataset
  • Figure 5: Point cloud optimization effects
  • ...and 2 more figures