RSHazeDiff: A Unified Fourier-aware Diffusion Model for Remote Sensing Image Dehazing

Jiamei Xiong, Xuefeng Yan, Yongzhen Wang, Wei Zhao, Xiao-Ping Zhang, Mingqiang Wei

TL;DR

This paper proposes RSHazeDiff, a novel unified Fourier-aware diffusion model for remote sensing image dehazing. It includes a global compensated learning module that uses the Fourier transform to capture global dependency features of input images, which mitigates the boundary artifacts that arise when processing fixed-size patches.

Abstract

Haze severely degrades the visual quality of remote sensing images and hampers the performance of road extraction, vehicle detection, and traffic flow monitoring. The emerging denoising diffusion probabilistic model (DDPM) exhibits significant potential for dense haze removal thanks to its strong generative ability. Since remote sensing images contain extensive small-scale texture structures, it is important to effectively restore image details from hazy images. However, existing DDPM-based approaches fail to preserve image details and color fidelity well, limiting their dehazing capacity for remote sensing images. In this paper, we propose a novel unified Fourier-aware diffusion model for remote sensing image dehazing, termed RSHazeDiff. From a new perspective, RSHazeDiff explores the conditional DDPM to improve image quality in dense hazy scenarios, and it makes three key contributions. First, RSHazeDiff refines the training phase of the diffusion process by applying noise estimation and reconstruction constraints in a coarse-to-fine fashion. It thus remedies the unsatisfactory results caused by the simple noise estimation constraint in DDPM. Second, by taking frequency information as important prior knowledge during the iterative sampling steps, RSHazeDiff can preserve more texture details and color fidelity in dehazed images. Third, we design a global compensated learning module that uses the Fourier transform to capture the global dependency features of input images, which can effectively mitigate the effects of boundary artifacts when processing fixed-size patches. Experiments on both synthetic and real-world benchmarks validate the favorable performance of RSHazeDiff over state-of-the-art methods. Source code will be released at https://github.com/jm-xiong/RSHazeDiff.
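
To make the two training constraints mentioned above more concrete, the following is a minimal NumPy sketch of the standard DDPM forward process, a noise estimation loss, and a reconstruction loss on a clean-image estimate. It is not the authors' implementation. The linear beta schedule, `T = 1000`, all function names, and the use of mean squared error for both terms are assumptions made for illustration. The conditioning on the hazy image is omitted, and a one-step clean-image estimate stands in for the coarse sampling result that the paper refines.

```python
import numpy as np

T = 1000                                  # assumed number of diffusion steps
betas = np.linspace(1e-4, 2e-2, T)        # assumed linear noise schedule
alpha_bars = np.cumprod(1.0 - betas)      # cumulative products of (1 - beta_t)


def forward_diffuse(y0, t, eps):
    """Standard DDPM forward step: y_t = sqrt(abar_t) * y0 + sqrt(1 - abar_t) * eps."""
    a = alpha_bars[t]
    return np.sqrt(a) * y0 + np.sqrt(1.0 - a) * eps


def estimate_clean(y_t, t, eps_pred):
    """Invert the forward step using a noise prediction to estimate y0."""
    a = alpha_bars[t]
    return (y_t - np.sqrt(1.0 - a) * eps_pred) / np.sqrt(a)


def noise_loss(eps, eps_pred):
    """Noise estimation constraint (MSE is an assumption)."""
    return float(np.mean((eps - eps_pred) ** 2))


def reconstruction_loss(y0, y0_pred):
    """Reconstruction constraint on the image estimate (MSE is an assumption)."""
    return float(np.mean((y0 - y0_pred) ** 2))


rng = np.random.default_rng(0)
y0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))   # stand-in for a clean patch
eps = rng.standard_normal(y0.shape)             # Gaussian noise
t = 500
y_t = forward_diffuse(y0, t, eps)

# Stand-in for a network output: the true noise plus a small error.
eps_pred = eps + 0.05 * rng.standard_normal(y0.shape)
y0_pred = estimate_clean(y_t, t, eps_pred)

print("noise loss:", noise_loss(eps, eps_pred))
print("reconstruction loss:", reconstruction_loss(y0, y0_pred))
```

In this sketch a small error in the predicted noise is amplified in the image estimate at large `t`, because the estimate divides by `sqrt(abar_t)`. This is one way to see why a noise estimation constraint alone may not guarantee a faithful restored image.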

Paper Structure

This paper contains 19 sections, 14 equations, 10 figures, 8 tables, 2 algorithms.

Figures (10)

  • Figure 1: Impact of different components in RSHazeDiff. From (a) to (f): (a) the hazy image, and the dehazing results of (b) Patch-based DDPM model, (c) Patch-based DDPM model with PTS, (d) Patch-based DDPM model with PTS and FIR module, (e) our full model (Patch-based DDPM model + PTS + FIR + GCL), and (f) the ground-truth image. PTS, FIR and GCL refer to the phased training strategy, Fourier-aware iterative refinement module and global compensated learning module, respectively.
  • Figure 2: The overall architecture of RSHazeDiff for remote sensing image dehazing (RSID). RSHazeDiff contains a local reverse denoising process, a global compensated learning (GCL) module, and a feature fusion module. The local reverse denoising process samples locally over patches, while the GCL module processes the intact image to capture global properties in the Fourier domain. The global and local feature representations are then fused by the feature fusion module to produce the final dehazed image. We decompose an input image into $D$ overlapping fixed-size patches via ozdenizci2023restoring. $cond$ indicates that the $D$ patches serve as the conditions of the local reverse denoising process. $y_{t}^{(i)}$ denotes the $i$-th patch of the noisy image $y_{t}$, which is gradually sampled from the random noise patch $y_{T}^{(i)}$ to the haze-free patch $y_{0}^{(i)}$ by $t$ iterations. $f_{1}$ and $f_{2}$ represent the outputs of the local reverse denoising process and the GCL module, respectively.
  • Figure 3: The architecture of the Fourier-aware conditional diffusion model, which contains two processes: (a) The training process adopts the phased training strategy to produce finer dehazing results. The model is first optimized with a noise estimation constraint, and then the coarse sampling result of the reverse denoising process is refined with a reconstruction constraint. In addition, the FIR module extracts structure- and color-guided representations from the forward process in Fourier space to obtain finer sampling results. (b) The sampling process gradually converts Gaussian noise into the restored image. Note that the above model performs patch-level restoration.
  • Figure 4: The detailed structure of the fast Fourier block, which captures long-range context through a simple yet effective spectral transform.
  • Figure 5: Qualitative comparisons on the DHID dataset. From (a) to (h): (a) the hazy image, and the dehazing results of (b) AOD-Net li2017aod, (c) FFA-Net qin2020ffa, (d) DCIL zhang2022dense, (e) Trinity-Net chi2023trinity, (f) Focal-Net cui2023focal, (g) our RSHazeDiff, and (h) the ground-truth image. As observed, our RSHazeDiff can generate much clearer haze-free images with well-preserved details.
  • ...and 5 more figures
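
The patch-based processing described in the Figure 2 caption can be sketched as follows. This is a hedged NumPy illustration rather than the authors' code: the patch size of 64, the stride of 16, the function names, and the choice to average overlapping predictions when merging are assumptions. The sketch also assumes the image height and width fit the patch grid exactly.

```python
import numpy as np


def patch_corners(height, width, patch=64, stride=16):
    """Top-left corners of overlapping fixed-size patches covering the image."""
    if (height - patch) % stride != 0 or (width - patch) % stride != 0:
        raise ValueError("image size must fit the patch grid exactly")
    return [(r, c)
            for r in range(0, height - patch + 1, stride)
            for c in range(0, width - patch + 1, stride)]


def extract_patches(image, corners, patch=64):
    """Cut a (C, H, W) image into a (D, C, patch, patch) stack of patches."""
    return np.stack([image[:, r:r + patch, c:c + patch] for r, c in corners])


def merge_patches(patches, corners, shape, patch=64):
    """Reassemble patches into an image, averaging wherever patches overlap."""
    total = np.zeros(shape, dtype=np.float64)
    count = np.zeros(shape, dtype=np.float64)
    for p, (r, c) in zip(patches, corners):
        total[:, r:r + patch, c:c + patch] += p
        count[:, r:r + patch, c:c + patch] += 1.0
    return total / count


rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(3, 128, 128))   # stand-in for a hazy image

corners = patch_corners(128, 128)
patches = extract_patches(image, corners)            # D = 25 patches here
restored = merge_patches(patches, corners, image.shape)

print(patches.shape)
```

In a real pipeline each patch would be restored independently before merging, so neighbouring patches can disagree in their overlap. Averaging smooths such disagreements but has no access to image-wide context, which is the gap the global compensated learning module is described as addressing.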