Night-Voyager: Consistent and Efficient Nocturnal Vision-Aided State Estimation in Object Maps

Tianxiao Gao, Mingle Zhao, Chengzhong Xu, Hui Kong

TL;DR

Night-Voyager tackles the problem of reliable localization at night with low-cost cameras by leveraging a lightweight streetlight object map as a stable, non-pixel-level cue. It combines a fast P3P-based initialization, a two-stage cross-modal data association, and a matrix Lie group formulation with a feature-decoupled MSC-InEKF to deliver accurate, consistent, and efficient state estimation across day and night. The approach demonstrates robustness to severe illumination variations, reduces drift through map-based constraints, and maintains real-time performance with compact map storage. This work extends nocturnal vision by showing that prior environmental structure, encoded as object maps, can fundamentally overcome the insufficiency and inconsistency of pixel-level night vision, enabling practical round-the-clock navigation.

Abstract

Accurate and robust state estimation at nighttime is essential for autonomous robotic navigation to achieve nocturnal or round-the-clock tasks. An intuitive question arises: Can low-cost standard cameras be exploited for nocturnal state estimation? Regrettably, most existing visual methods may fail under adverse illumination conditions, even with active lighting or image enhancement. A pivotal insight, however, is that streetlights in most urban scenarios act as stable and salient prior visual cues at night, reminiscent of stars in deep space aiding spacecraft voyage in interstellar navigation. Inspired by this, we propose Night-Voyager, an object-level nocturnal vision-aided state estimation framework that leverages prior object maps and keypoints for versatile localization. We also find that the primary limitation of conventional visual methods under poor lighting conditions stems from the reliance on pixel-level metrics. In contrast, metric-agnostic, non-pixel-level object detection serves as a bridge between pixel-level and object-level spaces, enabling effective propagation and utilization of object map information within the system. Night-Voyager begins with a fast initialization to solve the global localization problem. By employing an effective two-stage cross-modal data association, the system delivers globally consistent state updates using map-based observations. To address the challenge of significant uncertainties in visual observations at night, a novel matrix Lie group formulation and a feature-decoupled multi-state invariant filter are introduced, ensuring consistent and efficient estimation. Through comprehensive experiments in both simulation and diverse real-world scenarios (spanning approximately 12.3 km), Night-Voyager showcases its efficacy, robustness, and efficiency, filling a critical gap in nocturnal vision-aided state estimation.

Paper Structure

This paper contains 36 sections, 41 equations, 21 figures, 8 tables.

Figures (21)

  • Figure 1: Comparison of different vision-aided state estimation methods. (a), (b), and (c) are camera images captured in nocturnal scenes. (d) displays the online localization (the blue curve) of Night-Voyager within the streetlight map (white boxes) and the matches (red spheres) via the proposed data association approach. (e) depicts the trajectories estimated by the odometer-aided OpenVINS [geneva2020openvins] (OpenVINS-Odom), the odometer-aided VINS-Mono [qin2018vins, liu2019visual] (VINS-Odom), and Night-Voyager, respectively. The color bar indicates the trajectory error scale with respect to the ground truth (purple curves).
  • Figure 2: Top row: active lighting can further aggravate inconsistent and imbalanced illumination issues while amplifying the backscatter effect caused by sparkling particles, resulting in a significant number of erroneous feature matches (colored dots and lines). Bottom row: object-level detection remains extraordinarily robust to varying lighting conditions. Even in low-light or completely dark nighttime scenarios, streetlights can consistently serve as stable and salient object-level features for detection (green detection boxes).
  • Figure 3: System overview of the proposed consistent and efficient nocturnal vision-aided state estimation framework, Night-Voyager.
  • Figure 4: Generation process of the streetlight map. The points of each LiDAR scan are projected onto the image and the streetlight points are filtered according to streetlight detections in the image. The final streetlight map is constructed by clustering the streetlight points and storing the poses estimated during the mapping process. With the streetlight map, the LiDAR-inertial SLAM is run again to calculate the virtual center of each streetlight cluster.
  • Figure 5: The projected geometric centers often deviate from streetlight observation centers. The red, blue, and brown crosses mark the projected geometric centers, virtual centers, and detection box centers, respectively.
  • ...and 16 more figures
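
The filtering step described in the Figure 4 caption (project each LiDAR scan onto the image, keep the points that land inside a streetlight detection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_streetlight_points`, the pinhole camera model, the box layout `[u_min, v_min, u_max, v_max]`, and the extrinsic `T_cam_lidar` are all assumptions made for illustration. The clustering and virtual-center steps from the caption are omitted.

```python
import numpy as np


def filter_streetlight_points(points_lidar, T_cam_lidar, K, boxes):
    """Keep LiDAR points whose image projection falls inside any detection box.

    Hypothetical helper, assuming an undistorted pinhole camera.

    points_lidar: (N, 3) points in the LiDAR frame.
    T_cam_lidar:  (4, 4) homogeneous transform from the LiDAR to the camera frame.
    K:            (3, 3) pinhole intrinsic matrix.
    boxes:        (M, 4) detection boxes as [u_min, v_min, u_max, v_max] in pixels.
    """
    points_lidar = np.asarray(points_lidar, dtype=float).reshape(-1, 3)
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    T_cam_lidar = np.asarray(T_cam_lidar, dtype=float)
    K = np.asarray(K, dtype=float)

    # Express the points in the camera frame.
    pts_cam = points_lidar @ T_cam_lidar[:3, :3].T + T_cam_lidar[:3, 3]

    # Only points in front of the camera can be observed.
    in_front = pts_cam[:, 2] > 1e-6

    # Pinhole projection; points behind the camera get a dummy depth of 1
    # to avoid division by zero, and are masked out below.
    depth = np.where(in_front, pts_cam[:, 2], 1.0)
    uv = (pts_cam @ K.T)[:, :2] / depth[:, None]

    # Compare every projected point (N, 1) against every box (M,) at once.
    u, v = uv[:, 0:1], uv[:, 1:2]
    inside = (
        (u >= boxes[:, 0]) & (u <= boxes[:, 2])
        & (v >= boxes[:, 1]) & (v <= boxes[:, 3])
    )

    keep = in_front & inside.any(axis=1)
    return points_lidar[keep]
```

In a mapping pipeline, the surviving points from all scans would then be transformed into the map frame and clustered per streetlight, as the caption outlines.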