SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening
Yu Zhong, Xiao Wu, Liang-Jian Deng, Zihan Cao
TL;DR
SSDiff treats remote sensing pansharpening as a spatial-spectral fusion problem by splitting a diffusion model into dedicated spatial and spectral branches. It introduces an alternating projection fusion module (APFM) that decouples and fuses features across subspaces, and a frequency modulation inter-branch module (FMIM) that balances frequency information between the branches; a LoRA-like branch-wise alternative fine-tuning method (L-BAF) further refines discriminative features without increasing parameters. On the WorldView-3, WorldView-2, GaoFen-2, and QuickBird datasets, SSDiff achieves state-of-the-art results in both reduced- and full-resolution settings, with strong spectral fidelity, good preservation of spatial detail, and competitive inference efficiency. The approach provides a principled, plug-in fusion mechanism for diffusion-based pansharpening, and the authors plan to release the code as open source upon acceptance.
Abstract
Pansharpening is a significant image fusion technique that merges the spatial content and spectral characteristics of remote sensing images to generate high-resolution multispectral images. Recently, denoising diffusion probabilistic models have been gradually applied to visual tasks, and low-rank adaptation (LoRA) has been used to enhance controllable image generation. In this paper, we introduce a spatial-spectral integrated diffusion model for the remote sensing pansharpening task, called SSDiff, which treats pansharpening as the fusion of spatial and spectral components from the perspective of subspace decomposition. Specifically, SSDiff uses spatial and spectral branches to learn spatial details and spectral features separately, then employs a purpose-designed alternating projection fusion module (APFM) to accomplish the fusion. Furthermore, we propose a frequency modulation inter-branch module (FMIM) to modulate the frequency distribution between branches. The two branches of SSDiff also work well together with the APFM when trained with a LoRA-like branch-wise alternative fine-tuning method, which refines SSDiff so that it captures component-discriminating features more fully. Finally, extensive experiments on four commonly used datasets, i.e., WorldView-3, WorldView-2, GaoFen-2, and QuickBird, demonstrate the superiority of SSDiff both visually and quantitatively. The code will be released as open source upon acceptance.
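To make the LoRA idea mentioned in the abstract concrete, the following is a minimal sketch of a generic low-rank adapter on a single linear layer, written in plain Python. It is not the SSDiff implementation: the class `LoRALinear`, the helper `matvec`, and the schedule function `trainable_branch` are hypothetical names introduced here, and the even/odd alternation between a "spatial" and a "spectral" branch is only one assumed reading of "branch-wise alternative fine-tuning".

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


class LoRALinear:
    """Frozen weight W plus a low-rank update (alpha / r) * B @ A.

    Illustrative only: a generic LoRA-style layer, not the authors' code.
    """

    def __init__(self, in_dim, out_dim, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Pretrained weight, kept frozen during fine-tuning.
        self.W = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
        # Low-rank factors: A is small random, B starts at zero, so the
        # adapter contributes nothing before any fine-tuning step.
        self.A = [[rng.gauss(0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def num_adapter_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def trainable_branch(round_index):
    """Assumed schedule: update the spatial adapter on even rounds and the
    spectral adapter on odd rounds, keeping the other branch fixed."""
    return "spatial" if round_index % 2 == 0 else "spectral"


layer = LoRALinear(in_dim=8, out_dim=8, rank=2)
x = [1.0] * 8
print(layer.forward(x) == matvec(layer.W, x))  # adapter is a no-op at init
print(layer.num_adapter_params())              # 2*8 + 8*2 adapter weights
print([trainable_branch(i) for i in range(4)])
```

In this toy setting the adapter holds 32 trainable values against 64 in the frozen 8×8 weight, and because `B` starts at zero the layer initially reproduces the frozen layer's output exactly. How SSDiff actually places and schedules its low-rank updates is not specified in the abstract.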
