psPRF: Pansharpening Planar Neural Radiance Field for Generalized 3D Reconstruction Satellite Imagery
Tongtong Zhang, Yuanxiang Li
TL;DR
psPRF introduces a generalized Planar Neural Radiance Field that fuses low-resolution RGB and high-resolution PAN data within an RPC-aware, multimodal encoder framework. By employing Spectral-to-Spatial Convolution, a depth-embedded MPI decoder, and differentiable RPC reprojection, it jointly synthesizes HR-RGB, HR-PAN, and DSM outputs from a single image pair and generalizes across scenes. Experiments on WorldView-3 data show state-of-the-art performance in novel-view synthesis and altitude accuracy, with improved efficiency due to planar rendering. Pan-sharpening emerges as an image-synthesis outcome rather than a separate preprocessing step, enabling practical deployment for satellite 3D reconstruction across varying resolutions and viewpoints.
Abstract
Most current NeRF variants for satellite imagery are designed for one specific scene and fail to generalize to new geometry. Additionally, the RGB images require pan-sharpening as an independent preprocessing step. This paper introduces psPRF, a Planar Neural Radiance Field designed for paired low-resolution RGB (LR-RGB) and high-resolution panchromatic (HR-PAN) images from satellite sensors with Rational Polynomial Cameras (RPC). To capture the cross-modal prior from both the LR-RGB and HR-PAN images, we adapt the encoder of the UNet-shaped architecture with explicit spectral-to-spatial convolution (SSConv) to enhance its multimodal representation ability. To support the generalization of psPRF across scenes, we adopt a projection loss to ensure strong geometry self-supervision. The proposed method is evaluated on multi-scene WorldView-3 LR-RGB and HR-PAN pairs and achieves state-of-the-art performance.
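To make the "planar" part of the name concrete, the following is a minimal sketch of how a stack of per-plane colors and opacities can be composited into an image and an altitude map. It uses generic front-to-back alpha compositing, which is an assumption here and not necessarily the exact rendering equation of psPRF; the function name `composite_planes`, the array layout, and the per-plane `heights` are all hypothetical and chosen for illustration only.

```python
import numpy as np

def composite_planes(colors, alphas, heights):
    """Front-to-back compositing of D planes (illustrative, not the paper's code).

    colors:  (D, H, W, C) per-plane color, plane 0 nearest the sensor
    alphas:  (D, H, W)    per-plane opacity in [0, 1]
    heights: (D,)         altitude assigned to each plane (hypothetical sampling)
    """
    # Transmittance: fraction of light that survives all planes in front of plane i.
    trans = np.cumprod(1.0 - alphas, axis=0)
    trans = np.concatenate([np.ones_like(alphas[:1]), trans[:-1]], axis=0)

    # Contribution of each plane to the final pixel.
    weights = alphas * trans                                  # (D, H, W)

    # Weighted sum of plane colors gives the rendered image.
    image = (weights[..., None] * colors).sum(axis=0)         # (H, W, C)

    # Weighted sum of plane altitudes gives a DSM-like map.
    # Note: weights need not sum to 1 where the stack is not fully opaque.
    dsm = (weights * heights[:, None, None]).sum(axis=0)      # (H, W)
    return image, dsm, weights
```

Because the scene is represented by a small, fixed number of planes, one pass of this weighted sum yields every pixel at once, which is in line with the efficiency argument made for planar rendering.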
