Patch-GAN Transfer Learning with Reconstructive Models for Cloud Removal

Wanli Ma, Oktay Karakus, Paul L. Rosin

TL;DR

The paper tackles cloud removal in optical remote sensing by transferring knowledge from Masked Autoencoder (MAE) image reconstruction to cloud-free image generation within a Patch-GAN framework. The generator is a ViT-Large encoder–decoder pre-trained with MAE on ImageNet, paired with a patch-wise discriminator; fine-tuning uses a combined $L_{MSE}$ and $L_{GAN}$ objective with layer-wise learning rate decay. On the RICE1 and RICE2 datasets, the approach improves substantially over GAN-based baselines and is competitive with state-of-the-art methods in PSNR and SSIM, with the caveat that some baselines do not disclose their train/test data splits. The results indicate that reconstructive transfer learning yields more accurate, structurally faithful cloud-free reconstructions, and they pave the way for integrating larger vision-language models in future work.
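As a rough illustration of layer-wise learning rate decay, the sketch below assigns each layer of a pre-trained backbone a learning rate that shrinks geometrically toward the input, so early layers (which hold the most general pre-trained features) change least during fine-tuning. The function name and the numeric values for `base_lr` and `decay` are assumptions for illustration, not settings reported by the paper; only the 24-block depth corresponds to the standard ViT-Large configuration.

```python
def layerwise_learning_rates(base_lr, num_layers, decay):
    """Return one learning rate per layer index 0..num_layers.

    Index 0 stands for the patch embedding (closest to the input).
    Index num_layers is the last transformer block, which keeps base_lr.
    Each step toward the input multiplies the rate by `decay`.
    """
    return [base_lr * decay ** (num_layers - i) for i in range(num_layers + 1)]


# Hypothetical settings: 24 blocks as in ViT-Large; base_lr and decay are
# illustrative values, not taken from the paper.
rates = layerwise_learning_rates(base_lr=1e-4, num_layers=24, decay=0.75)

for index in (0, 12, 24):
    print(f"layer {index:2d}: lr = {rates[index]:.3e}")
```

In a real fine-tuning setup, each rate would typically be attached to the parameters of the matching layer through the optimizer's per-group learning rate.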

Abstract

Cloud removal plays a crucial role in enhancing remote sensing image analysis, yet accurately reconstructing cloud-obscured regions remains a significant challenge. Recent advancements in generative models have made the generation of realistic images increasingly accessible, offering new opportunities for this task. Given the conceptual alignment between image generation and cloud removal tasks, generative models present a promising approach for addressing cloud removal in remote sensing. In this work, we propose a deep transfer learning approach built on a generative adversarial network (GAN) framework to explore the potential of the novel masked autoencoder (MAE) image reconstruction model in cloud removal. Due to the complexity of remote sensing imagery, we further propose using a patch-wise discriminator to determine whether each patch of the image is real or not. The proposed reconstructive transfer learning approach demonstrates significant improvements in cloud removal performance compared to other GAN-based methods. Additionally, whilst direct comparisons with some of the state-of-the-art cloud removal techniques are limited due to unclear details regarding their train/test data splits, the proposed model achieves competitive results based on available benchmarks.
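To make the patch-wise discriminator idea concrete, the following minimal sketch scores an image as a grid of per-patch "real" probabilities rather than a single scalar, and combines a reconstruction term with an adversarial term for the generator. It is an assumption-laden toy in plain Python: the helper names, the binary cross-entropy form of the adversarial term, and the `gan_weight` factor are all hypothetical, since the text above states only that the two losses are combined.

```python
import math


def mse_loss(pred, target):
    """Mean squared error over two equally sized flat lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def patch_bce(patch_probs, is_real, eps=1e-7):
    """Binary cross-entropy averaged over a grid of per-patch probabilities.

    patch_probs[r][c] is the discriminator's probability that the patch at
    row r, column c is real. is_real is the label applied to every patch.
    """
    flat = [p for row in patch_probs for p in row]
    total = 0.0
    for p in flat:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -math.log(p) if is_real else -math.log(1.0 - p)
    return total / len(flat)


def generator_loss(pred, target, patch_probs_on_fake, gan_weight=1.0):
    """Reconstruction term plus adversarial term (weighting is an assumption).

    The generator is rewarded when every patch of its output is judged real.
    """
    return mse_loss(pred, target) + gan_weight * patch_bce(patch_probs_on_fake, True)


# Toy example: a 4-pixel "image" and a 2x2 grid of patch decisions.
generated = [0.2, 0.4, 0.6, 0.8]
ground_truth = [0.2, 0.4, 0.6, 1.0]
patch_scores = [[0.9, 0.8], [0.7, 0.3]]  # the last patch looks fake
print(generator_loss(generated, ground_truth, patch_scores))
```

Because each patch contributes its own term, a single unconvincing region (such as the 0.3 patch above) raises the loss even when the rest of the image looks realistic.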

Paper Structure (11 sections, 7 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: The overall framework of the proposed deep transfer learning approach.
  • Figure 2: Samples from the RICE dataset. The top row shows cloudy images, and the bottom row shows cloud-free images. The two samples on the left are from RICE1; the two on the right are from RICE2.
  • Figure 3: Visual results on the RICE1 (top) and RICE2 (bottom) datasets. From left to right: the cloud-covered image, the generated cloud-free image, and the ground-truth cloud-free image.