Towards 3D Scene Understanding of Gas Plumes in LWIR Hyperspectral Images Using Neural Radiance Fields

Scout Jarman, Zigfried Hampel-Arias, Adra Carr, Kevin R. Moon

TL;DR

This method, built on the standard Mip-NeRF architecture, combines state-of-the-art methods for hyperspectral NeRFs and sparse-view NeRFs, along with a novel adaptive weighted MSE loss.

Abstract

Hyperspectral images (HSI) have many applications, ranging from environmental monitoring to national security, and can be used for material detection and identification. Longwave infrared (LWIR) HSI can be used for gas plume detection and analysis. Oftentimes, only a few images of a scene of interest are available and are analyzed individually. The ability to combine information from multiple images into a single, cohesive representation could enhance analysis by providing more context on the scene's geometry and spectral properties. Neural radiance fields (NeRFs) create a latent neural representation of volumetric scene properties that enable novel-view rendering and geometry reconstruction, offering a promising avenue for hyperspectral 3D scene reconstruction. We explore the possibility of using NeRFs to create 3D scene reconstructions from LWIR HSI and demonstrate that the model can be used for the basic downstream analysis task of gas plume detection. The physics-based DIRSIG software suite was used to generate a synthetic multi-view LWIR HSI dataset of a simple facility with a strong sulfur hexafluoride gas plume. Our method, built on the standard Mip-NeRF architecture, combines state-of-the-art methods for hyperspectral NeRFs and sparse-view NeRFs, along with a novel adaptive weighted MSE loss. Our final NeRF method requires around 50% fewer training images than the standard Mip-NeRF and achieves an average PSNR of 39.8 dB with as few as 30 training images. Gas plume detection applied to NeRF-rendered test images using the adaptive coherence estimator achieves an average AUC of 0.821 when compared with detection masks generated from ground-truth test images.
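The adaptive coherence estimator (ACE) used for plume detection can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation. It uses the common squared form of ACE and follows the Figure 1 caption in estimating the background mean $\mu$ and covariance $\Sigma$ from all pixels and thresholding the scores at 0.6. It assumes the target spectrum $\mathbf{t}$ is used without mean subtraction, and the function names `ace_scores` and `plume_mask` are hypothetical.

```python
import numpy as np


def ace_scores(cube, target):
    """Squared ACE score per pixel.

    cube:   (H, W, C) hyperspectral image.
    target: (C,) target spectrum (e.g. a gas absorption spectrum).
    """
    h, w, c = cube.shape
    pixels = cube.reshape(-1, c).astype(np.float64)

    # Background statistics estimated from all pixels (as in the Figure 1 caption).
    mu = pixels.mean(axis=0)
    sigma = np.cov(pixels, rowvar=False)
    # Pseudo-inverse is an assumption here; it guards against a singular covariance.
    sigma_inv = np.linalg.pinv(sigma)

    centered = pixels - mu
    t_white = sigma_inv @ target                      # Sigma^{-1} t
    numerator = (centered @ t_white) ** 2             # (t^T Sigma^{-1} (x - mu))^2
    t_norm = target @ t_white                         # t^T Sigma^{-1} t
    x_norm = np.einsum("ij,jk,ik->i", centered, sigma_inv, centered)

    # Small constant avoids division by zero for pixels equal to the mean.
    scores = numerator / (t_norm * x_norm + 1e-12)
    return scores.reshape(h, w)


def plume_mask(scores, threshold=0.6):
    """Binary plume mask obtained by thresholding the ACE scores."""
    return scores > threshold
```

In this squared form the score lies between 0 and 1, so a fixed threshold such as 0.6 is comparable across images.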

Paper Structure

This paper contains 13 sections, 13 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Example of the gas plume detection process. (Left): False-colored HSI using wavelengths 10.4, 8.1, and 8.5 $\mu$m for the red, green, and blue channels. (Middle): ACE detection score map with background $\mu$ and $\Sigma$ estimated using all pixels, and using the SF$_6$ absorption spectrum for $\mathbf{t}$. (Right): Plume mask created by thresholding the ACE scores at $0.6$.
  • Figure 2: Example of the adaptive weighted L2 loss weight calculation. (1) Example false-colored training images with their associated NeRF renderings. (2) The squared channel residuals for each pixel. (3) The weighting for each channel after averaging the residuals and scaling to sum to one.
  • Figure 3: (a) The visible light (RGB) coloring of our simulated scene, showing the stack, facility, road, and building. (b) The absorption spectrum of SF$_6$ used in the plume simulation. (c) The hemisphere of camera positions from which images of the scene are captured. Each image points toward the base of the stack.
  • Figure 4: False-color renderings compared to ground truth images. The first row shows the ground truth, the second row shows the renderings from our method, and the third row shows the renderings from Mip-NeRF. The first column is for 20 training images, followed by 40, 50, and 100 training images. The seed that produced the highest geometric average of PSNR/55, SSIM, AUC, TPR, and 1-FPR for each image was used for rendering; these represent the best-case scenario for performance. The PSNR and SSIM are printed above each rendered image. See Video 1 (MP4, 4.7 MB) for "drone path," false-color rendered videos of the eight models used to generate these still images.
  • Figure 5: Plot of Mip-NeRF and our method's image reconstruction and detection performance (slight jitter applied to x coordinates). The lines and error bars show the average model performance, with standard deviation, from Tables \ref{tab:recon metrics} and \ref{tab:detection metrics}. Each of the five experiments is plotted to show the spread in model performance across different training samples. With the detection metrics AUC and TPR, a single experiment can produce high detection performance. This can be seen with our method on 20 training images, and with Mip-NeRF on 40 training images.
  • ...and 3 more figures
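
The weight calculation described in the Figure 2 caption (square the channel residuals per pixel, average them per channel, then scale the result to sum to one) can be sketched as below. This is a hedged illustration only: how the weights enter the final loss is not stated in the captions, so the weighted sum of per-channel MSEs in `adaptive_weighted_mse` is an assumption, as are the function names and the `eps` parameter.

```python
import numpy as np


def adaptive_channel_weights(pred, target, eps=1e-12):
    """Per-channel weights from averaged squared residuals.

    pred, target: (N, C) arrays of rendered and ground-truth pixel spectra.
    Returns a (C,) array of weights that sums to (approximately) one.
    """
    sq_residuals = (pred - target) ** 2        # squared channel residual per pixel
    mean_residuals = sq_residuals.mean(axis=0) # average over pixels, one value per channel
    return mean_residuals / (mean_residuals.sum() + eps)


def adaptive_weighted_mse(pred, target):
    """Assumed loss form: weighted sum of per-channel mean squared errors."""
    weights = adaptive_channel_weights(pred, target)
    per_channel_mse = ((pred - target) ** 2).mean(axis=0)
    return float(np.sum(weights * per_channel_mse))
```

Under this sketch, channels with larger reconstruction error receive larger weights, so the loss emphasizes the spectral bands the model currently renders worst. In a training framework the weights would likely be treated as constants when differentiating, which is also an assumption.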