SatFusion: A Unified Framework for Enhancing Satellite IoT Images via Multi-Temporal and Multi-Source Data Fusion

Yufei Tong, Guanjie Cheng, Peihan Wu, Yicheng Zhu, Kexu Lu, Feiyi Chen, Meng Xi, Junqin Huang, Xueqiang Yan, Junfan Wang, Shuiguang Deng

TL;DR

Sat-IoT imagery suffers from low spatial resolution and cross-sensor misalignment; Multi-Image Super-Resolution (MISR) and pansharpening each address part of this problem but remain decoupled. The authors introduce SatFusion, a three-component framework that jointly fuses multi-temporal low-resolution multispectral (LRMS) images with a panchromatic (Pan) image from a different source to produce a single high-resolution multispectral (HRMS) image, guided by a composite loss that balances texture and spectral fidelity. Across WorldStrat, WV3, QB, and GF2, SatFusion yields consistent gains over dedicated MISR and pansharpening baselines and remains robust under blur, noise, and misregistration. The unified approach produces higher-quality satellite imagery with less redundant data, which supports practical Sat-IoT deployment and downstream analytics.

Abstract

With the rapid advancement of the digital society, the proliferation of satellites in the Satellite Internet of Things (Sat-IoT) has led to the continuous accumulation of large-scale multi-temporal and multi-source images across diverse application scenarios. However, existing methods fail to fully exploit the complementary information embedded in both temporal and source dimensions. For example, Multi-Image Super-Resolution (MISR) enhances reconstruction quality by leveraging temporal complementarity across multiple observations, yet the limited fine-grained texture details in input images constrain its performance. Conversely, pansharpening integrates multi-source images by injecting high-frequency spatial information from panchromatic data, but typically relies on pre-interpolated low-resolution inputs and assumes noise-free alignment, making it highly sensitive to noise and misregistration. To address these issues, we propose SatFusion: A Unified Framework for Enhancing Satellite IoT Images via Multi-Temporal and Multi-Source Data Fusion. Specifically, SatFusion first employs a Multi-Temporal Image Fusion (MTIF) module to achieve deep feature alignment with the panchromatic image. Then, a Multi-Source Image Fusion (MSIF) module injects fine-grained texture information from the panchromatic data. Finally, a Fusion Composition module adaptively integrates the complementary advantages of both modalities while dynamically refining spectral consistency, supervised by a weighted combination of multiple loss functions. Extensive experiments on the WorldStrat, WV3, QB, and GF2 datasets demonstrate that SatFusion significantly improves fusion quality, robustness under challenging conditions, and generalizability to real-world Sat-IoT scenarios. The code is available at: https://github.com/dllgyufei/SatFusion.git.
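The abstract states that training is supervised by a weighted combination of multiple loss functions, but it does not name the terms or their weights. The sketch below shows one plausible shape for such a composite loss, assuming three illustrative terms: a pixel-wise L1 term, an image-gradient term as a stand-in for texture fidelity, and a spectral angle term as a stand-in for spectral fidelity. The function names, the choice of terms, and the default weights are all assumptions made for illustration, not the authors' configuration.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error over all bands and pixels; arrays are (C, H, W)."""
    return float(np.mean(np.abs(pred - target)))

def gradient_loss(pred, target):
    """L1 distance between finite-difference gradients (illustrative texture term)."""
    dy = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1))  # along height
    dx = np.abs(np.diff(pred, axis=2) - np.diff(target, axis=2))  # along width
    return float(dy.mean() + dx.mean())

def sam_loss(pred, target, eps=1e-8):
    """Mean spectral angle in radians between per-pixel band vectors."""
    dot = np.sum(pred * target, axis=0)
    norm = np.linalg.norm(pred, axis=0) * np.linalg.norm(target, axis=0)
    cos = np.clip(dot / (norm + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def composite_loss(pred, target, weights=(1.0, 0.5, 0.1)):
    """Weighted sum of the three terms; the weights here are hypothetical."""
    w_pix, w_grad, w_sam = weights
    return (w_pix * l1_loss(pred, target)
            + w_grad * gradient_loss(pred, target)
            + w_sam * sam_loss(pred, target))
```

The spectral angle term ignores per-pixel brightness scaling, so it penalizes changes in band ratios rather than intensity, while the gradient term reacts to missing edges. Weighting the two against each other is one way to express the texture versus spectral trade-off mentioned in the TL;DR.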

Paper Structure

This paper contains 23 sections, 21 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: Overview of the SatFusion system. LEO satellites transmit low-quality images to ground stations, where SatFusion performs fusion-based reconstruction to enhance image quality and reduce redundancy.
  • Figure 2: Overview of general MISR and pansharpening network designs. (a) General structure of MISR networks; (b) general structure of pansharpening networks; (c) common architecture I of the pansharpening fusion module; (d) common architecture II of the pansharpening fusion module.
  • Figure 3: Architecture of SatFusion. The framework takes multiple multi-temporal LRMS images and a single HRPAN image as inputs. These are processed sequentially through the multi-temporal image fusion module, the multi-source image fusion module, and the fusion composition module, producing the final HRMS image.
  • Figure 4: (left) Workflow of the conventional Wald protocol for dataset construction; (right) Workflow of the Wald protocol with introduced noise for dataset construction.
  • Figure 5: Quantitative comparison of PSNR for the fused images on the WorldStrat real-world dataset.
  • ...and 4 more figures
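
The Figure 4 and Figure 5 captions refer to Wald-protocol dataset construction (with and without introduced noise) and to PSNR as an evaluation metric. As a rough illustration only, the sketch below simulates reduced-resolution training data in the spirit of the Wald protocol: the original multispectral image serves as the reference, and degraded copies serve as inputs. Block averaging as the degradation model, integer circular shifts as misregistration, Gaussian noise, and every function and parameter name are assumptions; the paper's actual degradation pipeline is not described in this excerpt.

```python
import numpy as np

def downsample(img, ratio):
    """Block-average a (C, H, W) image by `ratio` (crude blur plus decimation)."""
    c, h, w = img.shape
    h2, w2 = h // ratio, w // ratio
    img = img[:, :h2 * ratio, :w2 * ratio]
    return img.reshape(c, h2, ratio, w2, ratio).mean(axis=(2, 4))

def wald_simulate(ms, pan, ratio=4, n_frames=1, noise_std=0.0, max_shift=0, seed=0):
    """Return (lr_frames, pan_lr, target) for reduced-resolution training.

    ms  : (C, H, W) original multispectral image, used as the reference.
    pan : (1, ratio*H, ratio*W) original panchromatic image.
    """
    rng = np.random.default_rng(seed)
    pan_lr = downsample(pan, ratio)  # Pan brought down to the MS grid
    frames = []
    for _ in range(n_frames):
        # Hypothetical misregistration: random integer circular shift per frame.
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        shifted = np.roll(ms, (int(dy), int(dx)), axis=(1, 2))
        lr = downsample(shifted, ratio)
        # Hypothetical sensor noise: additive Gaussian.
        lr = lr + rng.normal(0.0, noise_std, lr.shape)
        frames.append(lr)
    return np.stack(frames), pan_lr, ms

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

With `noise_std=0` and `max_shift=0` the function reduces to the noise-free setting; raising either value yields several slightly different low-resolution frames of the same scene, which is the kind of input a multi-temporal fusion model consumes.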