Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions

Jiacong Xu, Mingqian Liao, K Ram Prabhakar, Vishal M. Patel

TL;DR

Low-light NeRF methods struggle with noise and ISP clipping, hindering geometry and texture recovery. The paper introduces Thermal-NeRF, which fuses thermal and short-exposure visible raw images with Retinex3D and Multi-Exposure to enable simultaneous visible and thermal view synthesis in dark scenes, and it presents MVTV, the first aligned multi-view thermal-visible dataset. It also demonstrates that these modalities benefit each other and can extend to explicit 3DGS-based, high-speed rendering, achieving improved detail preservation and noise smoothing. The work has practical implications for rescue, surveillance, and multimodal 3D reconstruction in challenging lighting conditions.

Abstract

Neural Radiance Fields (NeRF) achieve photo-realistic novel view synthesis by learning an implicit volumetric representation of a scene from multi-view images that faithfully convey colorimetric information. However, sensor noise contaminates low-value pixel signals, and the lossy camera image signal processor (ISP) further removes near-zero intensities in extremely dark conditions, degrading synthesis performance. Existing approaches reconstruct low-light scenes from raw images but struggle to recover texture and boundary details in dark regions. They are also unsuitable for high-speed models that rely on explicit representations. To address these issues, we present Thermal-NeRF, which takes thermal and visible raw images as inputs and synthesizes visible and thermal views simultaneously; the thermal camera is robust to illumination variation, and raw images preserve whatever cues remain in the dark. We also establish the first multi-view thermal and visible dataset (MVTV) to support research on multimodal NeRF. Thermal-NeRF achieves the best trade-off between detail preservation and noise smoothing and provides better synthesis performance than previous work. Finally, we demonstrate that the two modalities benefit each other in 3D reconstruction.
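To make the abstract's point about noise and ISP clipping concrete, the following is a minimal sketch, not the paper's method. It assumes a single dark pixel with zero-mean Gaussian sensor noise and models the ISP loss as a simple clamp at zero; the function name `simulate_dark_pixel` and all parameter values are hypothetical. It shows that averaging unclipped (raw) samples recovers the true signal, while averaging clipped samples is biased upward.

```python
import random


def simulate_dark_pixel(true_signal, noise_std, n_samples, seed=0):
    """Return (raw_mean, clipped_mean) for one noisy dark pixel.

    Assumptions (illustrative only): zero-mean Gaussian noise, and an
    ISP modeled as a clamp that discards values below zero.
    """
    rng = random.Random(seed)
    # Raw samples may dip below zero once the black level is subtracted.
    raw = [true_signal + rng.gauss(0.0, noise_std) for _ in range(n_samples)]
    # The clamp removes the negative half of the noise distribution.
    clipped = [max(0.0, v) for v in raw]
    raw_mean = sum(raw) / n_samples
    clipped_mean = sum(clipped) / n_samples
    return raw_mean, clipped_mean


raw_mean, clipped_mean = simulate_dark_pixel(
    true_signal=0.01, noise_std=0.05, n_samples=100_000
)
print(f"raw mean: {raw_mean:.4f}, clipped mean: {clipped_mean:.4f}")
```

Under these assumptions the raw mean stays close to the true signal, whereas the clipped mean overestimates it. This illustrates one reason low-light methods prefer raw inputs over ISP-processed images.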

Paper Structure (26 sections, 21 equations, 13 figures, 2 tables)

Figures (13)

  • Figure 1: Illustration of the synthesis performances of RawNeRF and our proposed Thermal-NeRF and its multimodal applications. We implement RawNeRF and Thermal-NeRF on Mip-NeRF and iNGP. The first four columns present the qualitative comparison of raw image rendering (postprocessed) of different methods. With the raw and thermal images generated by Thermal-NeRF, we can adjust the visible environment and highlight the thermal objects or fuse the two modalities by image fusion (SwinFusion) for better visual effects in some applications.
  • Figure 1: Illustrations of the effectiveness of Thermal Enhancement (T) on RawNeRF implemented on iNGP.
  • Figure 2: Pixel intensity distributions of raw images taken by short and long exposures. The noise level in dark regions is very high, and long exposure will alleviate this issue.
  • Figure 2: Visualization of the results with and without the predefined smoothing sign function. Color inconsistency and spot points are observed in the middle images.
  • Figure 3: Overview of the architecture of Thermal-NeRF. $ref$, $ill_j$, and $h$ represent the reflectance, $j$-th illumination, and thermal radiance, respectively. R refers to volume rendering. The model part can be adjusted based on different scene representations.
  • ...and 8 more figures