FusionRF: High-Fidelity Satellite Neural Radiance Fields from Multispectral and Panchromatic Acquisitions
Michael Sprintson, Rama Chellappa, Cheng Peng
TL;DR
FusionRF reconstructs digital surface models from satellite imagery without a pansharpening preprocessing step. It introduces a satellite NeRF that is optimized jointly on full-channel multispectral and panchromatic inputs, using a sparse cross-resolution kernel together with multimodal and transient embeddings to fuse the two modalities during training and render high-fidelity novel views. The approach yields a 17% average reduction in depth MAE compared to baselines, remains robust when panchromatic data is limited, and can be adapted to EO-NeRF-style extensions. The result is a practical path to accurate DSM reconstruction directly from the multispectral and panchromatic products that satellites deliver, without hand-crafted pansharpening.
Abstract
We introduce FusionRF, a novel framework for digital surface reconstruction from satellite multispectral and panchromatic images. Recent work has demonstrated that neural photogrammetry reconstructs surfaces from optical satellite images more accurately than algorithmic methods. Common satellites capture both a panchromatic image, which has high spatial resolution, and a multispectral image, which has high spectral resolution. Current neural reconstruction methods require the multispectral images to be upsampled by a pansharpening method that uses the spatial detail in the panchromatic image. However, pansharpening methods may introduce biases and hallucinations due to domain gaps. FusionRF instead performs image fusion jointly with optimization, through a novel cross-resolution kernel that learns to resolve the loss of spatial resolution in multispectral images. FusionRF also uses multimodal appearance embeddings that encode the image characteristics of each modality and view within a uniform representation. Because it takes the original multispectral and panchromatic data as input and optimizes on both modalities, FusionRF learns to fuse them while performing reconstruction and eliminates the need for a pansharpening preprocessing step. We evaluate our method on multispectral and panchromatic images from the WorldView-3 satellite at various locations and show that FusionRF reduces depth reconstruction error by 17% on average and renders sharp training and novel views.
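To make the idea of supervising one high-resolution rendering with both modalities concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fixed (not learned, not sparse) blur kernel, a linear band mix for the panchromatic response, and a toy 2x resolution ratio; the function names `pan_from_bands`, `downsample_with_kernel`, and `joint_loss` are hypothetical and chosen only for illustration.

```python
from typing import List

Image = List[List[List[float]]]  # indexed as [row][col][band]


def pan_from_bands(hi_res: Image, band_weights: List[float]) -> List[List[float]]:
    """Collapse bands into one panchromatic value per pixel.
    Assumption: the panchromatic response is a linear mix of the bands."""
    return [[sum(w * v for w, v in zip(band_weights, px)) for px in row]
            for row in hi_res]


def downsample_with_kernel(hi_res: Image, kernel: List[List[float]]) -> Image:
    """Model each low-resolution multispectral pixel as a kernel-weighted sum
    of the s x s block of high-resolution pixels it covers (s = kernel size).
    Assumption: the kernel is fixed here; in FusionRF it is learned."""
    s = len(kernel)
    rows, cols, bands = len(hi_res), len(hi_res[0]), len(hi_res[0][0])
    out = []
    for r in range(0, rows - s + 1, s):
        out_row = []
        for c in range(0, cols - s + 1, s):
            px = [0.0] * bands
            for i in range(s):
                for j in range(s):
                    for b in range(bands):
                        px[b] += kernel[i][j] * hi_res[r + i][c + j][b]
            out_row.append(px)
        out.append(out_row)
    return out


def _flatten(x) -> List[float]:
    """Flatten arbitrarily nested lists of numbers into one flat list."""
    if isinstance(x, (int, float)):
        return [float(x)]
    flat: List[float] = []
    for item in x:
        flat.extend(_flatten(item))
    return flat


def _mse(a, b) -> float:
    fa, fb = _flatten(a), _flatten(b)
    return sum((p - q) ** 2 for p, q in zip(fa, fb)) / len(fa)


def joint_loss(hi_res: Image, pan_obs, ms_obs,
               band_weights: List[float], kernel: List[List[float]]) -> float:
    """Sum of two reconstruction errors computed from the same rendering:
    one against the panchromatic image, one against the raw multispectral image."""
    pan_err = _mse(pan_from_bands(hi_res, band_weights), pan_obs)
    ms_err = _mse(downsample_with_kernel(hi_res, kernel), ms_obs)
    return pan_err + ms_err
```

In this toy setup, a rendering that matches the true scene gives zero loss against both observations, while a rendering that is only correct at low resolution is still penalized by the panchromatic term. That is the intuition behind optimizing on both modalities rather than pansharpening first.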
