SatDiffMoE: A Mixture of Estimation Method for Satellite Image Super-resolution with Latent Diffusion Models

Zhaoxu Luo, Bowen Song, Liyue Shen

TL;DR

SatDiffMoE is a novel diffusion-based fusion algorithm that takes an arbitrary number of sequential low-resolution satellite images of the same location as input and fuses them into one high-resolution reconstruction with finer details by exploiting the complementary information from different time points.

Abstract

During the acquisition of satellite images, there is generally a trade-off between spatial resolution and temporal resolution (acquisition frequency) due to the onboard sensors of satellite imaging systems. High-resolution satellite images are very important for land crop monitoring, urban planning, wildfire management, and a variety of other applications. Achieving high spatial and temporal resolution in satellite imaging is a significant yet challenging task. With the advent of diffusion models, we can now learn strong generative priors that produce realistic high-resolution satellite images, and these priors can also be used for the super-resolution task. In this work, we propose a novel diffusion-based fusion algorithm called SatDiffMoE that takes an arbitrary number of sequential low-resolution satellite images of the same location as input and fuses them into one high-resolution reconstructed image with finer details by exploiting the complementary information from different time points. Our algorithm is highly flexible and allows training and inference on an arbitrary number of low-resolution images. Experimental results show that SatDiffMoE not only achieves superior performance on satellite image super-resolution tasks across a variety of datasets, but also improves computational efficiency with fewer model parameters compared with previous methods.

Paper Structure (35 sections, 6 equations, 7 figures, 7 tables)

This paper contains 35 sections, 6 equations, 7 figures, 7 tables.

Figures (7)

  • Figure 1: An overview of our proposed method SatDiffMoE.
  • Figure 2: The framework of our proposed SatDiffMoE. In the training phase, we train a latent diffusion model for the HR (high-resolution) image, conditioned on a single LR (low-resolution) image and its relative time difference from the HR image. In the inference phase, we fuse the reverse sampling trajectories, each conditioned on one LR image of the same location. Different trajectories can be randomly selected for fusion, but the output is a single image.
  • Figure 3: (Left) Super-resolution on fMoW dataset. (Right) Super-resolution on WorldStrat dataset.
  • Figure 4: (Left) Ablation study on LPIPS weight $\alpha$. (Right) Ablation study on optimization weight $\lambda$.
  • Figure 5: Super-resolution results on fMoW dataset by SatDiffMoE with and without fusion. We show SatDiffMoE without fusion by conditioning on the first LR image and SatDiffMoE with fusion by fusing the four LR images.
  • ...and 2 more figures
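
The Figure 2 caption describes fusing several reverse sampling trajectories, each conditioned on a different LR image, into a single output. The toy sketch below illustrates only the general idea of pulling several per-condition estimates toward a shared center; it is not the paper's actual update rule. The names `mean_estimate` and `fuse_step`, the plain-list "latents", and the blend factor `lam` are all assumptions made for illustration, and `lam` is not claimed to be the optimization weight $\lambda$ studied in Figure 4.

```python
from typing import List

Vector = List[float]


def mean_estimate(estimates: List[Vector]) -> Vector:
    """Element-wise mean of the per-condition estimates (hypothetical helper)."""
    n = len(estimates)
    if n == 0:
        raise ValueError("need at least one estimate")
    dim = len(estimates[0])
    return [sum(e[i] for e in estimates) / n for i in range(dim)]


def fuse_step(estimates: List[Vector], lam: float) -> List[Vector]:
    """Pull every estimate toward the shared mean.

    lam = 0 leaves the estimates untouched; lam = 1 collapses them onto the mean.
    This is an illustrative blend, not the method from the paper.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    center = mean_estimate(estimates)
    return [
        [(1.0 - lam) * x + lam * c for x, c in zip(e, center)]
        for e in estimates
    ]


# Three hypothetical estimates, one per LR image of the same location.
estimates = [[0.0, 2.0], [1.0, 2.0], [2.0, 5.0]]
fused = fuse_step(estimates, lam=0.5)
```

In a real sampler, a step like this would presumably be interleaved with the denoising updates of each trajectory, so that the trajectories agree by the final step. Because the function accepts a list of any length, the sketch also mirrors the claim that an arbitrary number of LR inputs can be used.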