SLAM in the Dark: Self-Supervised Learning of Pose, Depth and Loop-Closure from Thermal Images
Yangfan Xu, Qu Hao, Lilian Zhang, Jun Mao, Xiaofeng He, Wenqi Wu, Changhao Chen
TL;DR
DarkSLAM addresses monocular thermal SLAM in outdoor, low-light conditions by combining self-supervised pose and depth learning with targeted architectural enhancements. It adds Efficient Channel Attention (ECA) to PoseNet, pairs Dino-ResNet50 with Selective Kernel Attention (SKA) in DepthNet, and uses a Siamese LoopNet for robust loop-closure detection, all integrated with a pose-graph optimization backend. The framework achieves large-scale localization and dense mapping in complex thermal environments and outperforms prior methods in pose accuracy and loop-closure reliability, while remaining real-time capable on a high-end GPU. By reducing reliance on labeled data and improving feature robustness in degraded thermal imagery, DarkSLAM has practical potential for night-time navigation, search-and-rescue, and autonomous monitoring where visible-light SLAM fails. Future work will target more reliable loop detection under varying thermal conditions, better handling of scene dynamics, and porting the system to resource-constrained edge devices.
Abstract
Visual SLAM is essential for mobile robots, drone navigation, and VR/AR, but traditional RGB camera systems struggle in low-light conditions, driving interest in thermal SLAM, which excels in such environments. However, thermal imaging suffers from low contrast, high noise, and a scarcity of large-scale annotated datasets, which restricts the use of deep learning in outdoor scenarios. We present DarkSLAM, a novel deep learning-based monocular thermal SLAM system designed for large-scale localization and reconstruction in complex lighting conditions. Our approach incorporates the Efficient Channel Attention (ECA) mechanism in visual odometry and the Selective Kernel Attention (SKA) mechanism in depth estimation to enhance pose accuracy and mitigate thermal depth degradation. Additionally, the system includes thermal depth-based loop closure detection and pose optimization, ensuring robust performance in low-texture thermal scenes. Extensive outdoor experiments demonstrate that DarkSLAM significantly outperforms existing methods such as SC-SfMLearner and the method of Shin et al., delivering precise localization and 3D dense mapping even in challenging nighttime environments.
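
For readers unfamiliar with ECA, the sketch below illustrates the general mechanism as it is commonly described: each channel is average-pooled to a scalar, a small 1D convolution mixes neighbouring channels, and a sigmoid turns the result into per-channel gates. This is not the authors' implementation. The function name `eca`, the NumPy formulation, the kernel size, and the random weights are all assumptions made for illustration; in a trained network the kernel would be learned.

```python
import numpy as np


def eca(features, kernel):
    """Illustrative ECA gate (hypothetical helper, not the DarkSLAM code).

    features: array of shape (C, H, W), one feature map per channel.
    kernel:   1D array of odd length k, shared across all channels.
    """
    num_channels = features.shape[0]
    k = kernel.shape[0]
    if k % 2 != 1:
        raise ValueError("kernel length must be odd")

    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = features.mean(axis=(1, 2))  # shape (C,)

    # Local cross-channel interaction: 1D cross-correlation over the
    # channel axis, zero-padded so the output keeps length C.
    padded = np.pad(pooled, k // 2)
    mixed = np.array([padded[i:i + k] @ kernel for i in range(num_channels)])

    # Gate: sigmoid maps each mixed descriptor to a weight in (0, 1).
    gates = 1.0 / (1.0 + np.exp(-mixed))

    # Rescale every channel by its gate; spatial layout is unchanged.
    return features * gates[:, None, None]


# Example with made-up sizes: 8 channels of a 4x5 thermal feature map.
rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(8, 4, 5))
conv_weights = rng.normal(size=3)  # assumed kernel size k = 3
reweighted = eca(feature_maps, conv_weights)
```

The output has the same shape as the input, so a gate of this kind can be dropped between existing convolutional layers without changing the rest of the network.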
