Diffusion Models Meet Remote Sensing: Principles, Methods, and Perspectives

Yidan Liu, Jun Yue, Shaobo Xia, Pedram Ghamisi, Weiying Xie, Leyuan Fang

TL;DR

The theoretical background of diffusion models is introduced, and the applications of diffusion models in RS are systematically reviewed, including image generation, enhancement, and interpretation.

Abstract

As a newly emerging advance in deep generative models, diffusion models have achieved state-of-the-art results in many fields, including computer vision, natural language processing, and molecule design. The remote sensing (RS) community has also noticed the powerful ability of diffusion models and quickly applied them to a variety of tasks for image processing. Given the rapid increase in research on diffusion models in the field of RS, it is necessary to conduct a comprehensive review of existing diffusion model-based RS papers, to help researchers recognize the potential of diffusion models and provide some directions for further exploration. Specifically, this article first introduces the theoretical background of diffusion models, and then systematically reviews the applications of diffusion models in RS, including image generation, enhancement, and interpretation. Finally, the limitations of existing RS diffusion models and worthy research directions for further exploration are discussed and summarized.

Paper Structure (32 sections, 13 equations, 15 figures, 5 tables)

Figures (15)

  • Figure 1: The number of papers on diffusion models for various RS tasks from 2021 to 2024. The data for 2024 extend only to the second quarter. The chart shows that diffusion models first dominated RS image generation and then quickly expanded to more complex applications. For instance, the number of papers on climate prediction increased from 3 in 2023 to 7 in 2024, more than doubling. Similarly, the number of papers on diffusion models for RS image super-resolution rose from 10 to 17, an increase of 70%.
  • Figure 2: The training procedure of the denoising diffusion probabilistic model (DDPM), where yellow lines represent the forward diffusion process and blue lines represent the backward diffusion process. Specifically, the network is used to fit the distribution $p(x_{t-1} | x_t)$ and outputs the predicted noise $\epsilon_{\theta}$. The distance between the predicted noise $\epsilon_{\theta}$ and the actual noise $\epsilon$ is then minimized, thereby optimizing the network and completing the training of the diffusion model.
  • Figure 3: The sampling process of DDPM. Supposing that sampling begins at $T = 1000$, the predicted noise $\epsilon_{\theta}(y_t, t)$ is obtained from the well-trained diffusion model. This noise is then subtracted from the noisy image $Y_t$, resulting in a denoised image $Y_{t-1}$. The denoised image $Y_{t-1}$ is in turn fed into the diffusion model to obtain the predicted noise for the next timestep. This process is repeated until $t = 1$, at which point the denoised image is quite clear.
  • Figure 4: The structure of the latent diffusion model, where $E$ and $D$ represent the encoder and decoder, respectively. The image is obtained from SD.
  • Figure 5: The proposed taxonomy of diffusion model applications in RS. The images (a)-(c) are obtained from [52], [34], and [37], respectively.
  • ...and 10 more figures
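
The training and sampling procedures described in the captions of Figures 2 and 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`linear_beta_schedule`, `q_sample`, `noise_prediction_loss`, `p_sample_step`) are hypothetical, the linear noise schedule from 1e-4 to 0.02 and the choice of step variance equal to $\beta_t$ are assumptions borrowed from the common DDPM setup, timesteps are 0-indexed (index 0 corresponds to $t = 1$), and the trained network is replaced by whatever noise estimate the caller passes in.

```python
import math
import random


def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, alpha_bars, noise):
    """Forward process: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, noise)]


def noise_prediction_loss(eps_pred, eps):
    """Training objective: mean squared error between predicted and actual noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)


def p_sample_step(x_t, t, eps_pred, betas, alpha_bars, rng):
    """One reverse step x_t -> x_{t-1}, given the predicted noise for step t."""
    beta, a_bar = betas[t], alpha_bars[t]
    coef = beta / math.sqrt(1.0 - a_bar)
    # Remove the scaled noise estimate, then undo the forward-step scaling.
    mean = [(x - coef * e) / math.sqrt(1.0 - beta) for x, e in zip(x_t, eps_pred)]
    if t == 0:
        return mean  # final step: no fresh noise is added
    sigma = math.sqrt(beta)  # assumed variance choice: sigma_t^2 = beta_t
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
```

In a real model, `eps_pred` would come from the trained network $\epsilon_{\theta}$ evaluated on the noisy image and the timestep. Sampling would start from pure Gaussian noise and call `p_sample_step` for each timestep from the last index down to 0.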