ThermoNeRF: Joint RGB and Thermal Novel View Synthesis for Building Facades using Multimodal Neural Radiance Fields
Mariam Hassan, Florent Forest, Olga Fink, Malcolm Mielle
TL;DR
ThermoNeRF introduces a multimodal Neural Radiance Field framework that jointly renders RGB and thermal views while preserving temperature fidelity. It shares a single density MLP for geometry and uses separate heads for RGB and temperature, which prevents cross-modal leakage and yields more accurate temperature reconstruction than baselines that take concatenated RGB+thermal inputs. The authors also present ThermoScenes, a paired RGB+thermal dataset of 16 scenes that supports evaluation of both temperature accuracy and novel-view synthesis. Empirical results show average temperature MAEs of $1.13^{\circ}\text{C}$ (buildings) and $0.41^{\circ}\text{C}$ (other scenes), an improvement of over 50% compared with a standard NeRF fed concatenated RGB+thermal data, which points to practical potential for building retrofit, energy analysis, and infrastructure inspection. The work advances thermal scene understanding by unifying geometry and temperature in a single NeRF framework, and it provides a public dataset and code to foster further research.
Abstract
Thermal scene reconstruction holds great potential for various applications, such as analyzing building energy consumption and performing non-destructive infrastructure testing. However, existing methods typically require dense scene measurements and often rely on RGB images for 3D geometry reconstruction, projecting thermal information onto the geometry after reconstruction. This can cause the reconstructed geometry and temperature data to deviate from their actual values. To address this challenge, we propose ThermoNeRF, a novel multimodal approach based on Neural Radiance Fields that jointly renders new RGB and thermal views of a scene, and ThermoScenes, a dataset of paired RGB+thermal images comprising 8 scenes of building facades and 8 scenes of everyday objects. To address the lack of texture in thermal images, ThermoNeRF uses paired RGB and thermal images to learn scene density, while separate networks estimate color and temperature data. Unlike comparable studies, we focus on temperature reconstruction. Experimental results demonstrate that ThermoNeRF achieves an average mean absolute error of 1.13 °C and 0.41 °C for temperature estimation in buildings and other scenes, respectively, an improvement of over 50% compared to using concatenated RGB+thermal data as input to a standard NeRF. Code and dataset are available online.
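The shared-density idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the random untrained weights, the feature width `FEAT`, the sigmoid output ranges, the choice to give the view direction only to the color head, and all function names (`init_mlp`, `mlp`, `query`, `render_ray`) are assumptions made for illustration. The sketch only shows the structural point that one density estimate produces the compositing weights for both modalities, while color and temperature come from separate networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random, untrained weights for a small fully connected net (illustrative only)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """ReLU on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

FEAT = 16  # hypothetical width of the shared geometry feature
density_net = init_mlp([3, 32, 1 + FEAT])  # shared: position -> (density, features)
rgb_head = init_mlp([FEAT + 3, 32, 3])     # features + view direction -> color
thermal_head = init_mlp([FEAT, 32, 1])     # features -> normalized temperature (assumed view-independent)

def query(points, view_dirs):
    """Evaluate density once, then run the two separate heads."""
    out = mlp(density_net, points)
    sigma = np.log1p(np.exp(out[:, :1]))   # softplus keeps density non-negative
    feat = out[:, 1:]
    rgb = sigmoid(mlp(rgb_head, np.concatenate([feat, view_dirs], axis=1)))
    temp = sigmoid(mlp(thermal_head, feat))
    return sigma, rgb, temp

def render_ray(sigma, values, deltas):
    """Standard NeRF volume rendering of per-sample values along one ray."""
    alpha = 1.0 - np.exp(-sigma[:, 0] * deltas)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * transmittance
    return weights @ values

# One ray with 8 samples: the same sigma composites both modalities.
points = rng.uniform(-1.0, 1.0, (8, 3))
view_dirs = np.tile([0.0, 0.0, 1.0], (8, 1))
deltas = np.full(8, 0.1)

sigma, rgb, temp = query(points, view_dirs)
pixel_rgb = render_ray(sigma, rgb, deltas)    # shape (3,)
pixel_temp = render_ray(sigma, temp, deltas)  # shape (1,)
```

In a trained model, the normalized temperature would be mapped back to degrees Celsius using the range of the thermal camera data, and both rendered outputs would be supervised against the paired RGB and thermal images.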
