
Towards Long Term SLAM on Thermal Imagery

Colin Keil, Aniket Gupta, Pushyami Kaveti, Hanumant Singh

TL;DR

This work tackles the challenge of long-term SLAM in thermal LWIR imagery, where diurnal appearance shifts hinder relocalization and map reuse. It integrates learned feature descriptors from the Gluestick feature matching pipeline into a Bag-of-Words place recognition framework and presents a baseline SLAM pipeline built on MCSLAM, evaluated on a new, diverse LWIR dataset with day–night sequences and ground truth. The results show strong day–night place recognition and competitive SLAM performance, with sub-3 m relocalization error in many cases, highlighting the practicality of all-day autonomy using thermal cameras. The dataset, learned vocabulary, and baseline pipeline provide a resource for robust long-term SLAM in degraded-visibility environments and set the stage for future optimizations and multi-modal extensions.

Abstract

Visual SLAM with thermal imagery, and in other low-contrast, visually degraded environments such as underwater or in areas dominated by snow and ice, remains a difficult problem for many state-of-the-art (SOTA) algorithms. In addition to challenging front-end data association, thermal imagery presents an additional difficulty for long-term relocalization and map reuse. The relative temperatures of objects in thermal imagery change dramatically from day to night. Feature descriptors typically used for relocalization in SLAM are unable to maintain consistency over these diurnal changes. We show that learned feature descriptors can be used within existing Bag of Words based localization schemes to dramatically improve place recognition across large temporal gaps in thermal imagery. To demonstrate the effectiveness of our trained vocabulary, we have developed a baseline SLAM system that integrates learned features and matching into a classical SLAM algorithm. Our system demonstrates good local tracking on challenging thermal imagery, and relocalization that overcomes dramatic day-to-night thermal appearance changes. Our code and datasets are available here: https://github.com/neufieldrobotics/IRSLAM_Baseline

Paper Structure (19 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Long Wave Infrared (thermal) imagery poses a significant challenge for place recognition due to dramatic appearance changes over the course of a day. At the top we show a pair of images taken with a static camera approximately 12 hours apart. At the bottom we show matches that are recoverable using the Gluestick feature matching pipeline.
  • Figure 2: Data Collection setup showing two FLIR Boson ADK cameras (green), and the RTK GPS antenna (blue).
  • Figure 3: We show a qualitative example of the number of matched features across images taken during the day (left pair) and then the same scene at night (right pair). At the top we show preprocessed images; below that we show ORB, SIFT, SuperPoint and SP+Gluestick. Features are matched with brute force matching and are filtered for geometric consistency using a RANSAC fundamental matrix estimation. SP features with the Gluestick matcher outperform all other methods and are notably better at matching features in the foreground, which is important for parallax in feature-based pose estimation.
  • Figure 4: Mean number of matches at each time-step with an error of less than three pixels on our timelapse dataset. There are 10 static outdoor scenes with images recorded every 10 minutes over a 24-hour period. For ORB, SIFT, and SuperPoint we use a brute force matcher.
  • Figure 5: Here we show our Day trajectory for the KRI dataset, along with the GPS ground track. ORBSLAM and ORB-MCSLAM are only able to track for small sections of the map. We also show a comparison with Droid SLAM; although it is not directly relevant to our work, it shows similar tracking performance. There is significant accumulated drift, but scale and qualitative features are correct. The trajectory ends due to a very difficult low-texture region. Note that Droid SLAM is able to track longer, but we have stopped it at the same point for visual clarity.
  • ...and 2 more figures
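
The Figure 4 caption counts matches "with error of less than three pixels" on static scenes. One plausible reading, sketched below, is that with a static camera a correctly matched feature should appear at nearly the same pixel location in both images, so the match error is the pixel distance between the two matched keypoints. This interpretation, the function names `count_inlier_matches` and `mean_matches_per_timestep`, and the data layout are assumptions made for illustration; they are not taken from the paper or its code.

```python
import math


def count_inlier_matches(kps_ref, kps_query, matches, max_error_px=3.0):
    """Count matches whose two keypoints are closer than max_error_px.

    Assumes a static camera, so a correct match should land on (almost)
    the same pixel in the reference and query images.
    kps_ref, kps_query: lists of (x, y) pixel coordinates.
    matches: list of (ref_index, query_index) pairs.
    """
    count = 0
    for i, j in matches:
        x_ref, y_ref = kps_ref[i]
        x_qry, y_qry = kps_query[j]
        if math.hypot(x_ref - x_qry, y_ref - y_qry) < max_error_px:
            count += 1
    return count


def mean_matches_per_timestep(counts_by_scene):
    """Average the per-scene counts at each time-step.

    counts_by_scene: one list per scene, each holding one count per
    time-step (all scenes are assumed to have the same number of steps).
    """
    n_scenes = len(counts_by_scene)
    n_steps = len(counts_by_scene[0])
    return [
        sum(scene[t] for scene in counts_by_scene) / n_scenes
        for t in range(n_steps)
    ]


# Hypothetical keypoints for one image pair (not data from the paper).
kps_ref = [(10.0, 10.0), (50.0, 20.0), (100.0, 80.0)]
kps_query = [(11.0, 10.5), (58.0, 20.0), (100.0, 82.9)]
matches = [(0, 0), (1, 1), (2, 2)]
n_good = count_inlier_matches(kps_ref, kps_query, matches)
```

In this toy pair the second match is displaced by 8 pixels and is rejected, while the other two fall under the threshold. Averaging such counts over the scenes at each time-step gives a curve of the kind the caption describes.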