Patch-GAN Transfer Learning with Reconstructive Models for Cloud Removal
Wanli Ma, Oktay Karakus, Paul L. Rosin
TL;DR
The paper tackles cloud removal in optical remote sensing by transferring knowledge from Masked Autoencoder (MAE) image reconstruction to cloud-free image generation within a Patch-GAN framework. The generator is a ViT-large encoder–decoder pre-trained with MAE on ImageNet, paired with a patch-wise discriminator; the model is fine-tuned with a combined $L_{MSE}$ and $L_{GAN}$ objective and layer-wise learning rate decay. On the RICE1/RICE2 datasets, the approach shows substantial improvements over GAN-based baselines and competitive performance against state-of-the-art methods, measured by higher PSNR and SSIM values, although comparisons with some of those methods are limited because their train/test data splits are undisclosed. The results demonstrate the value of reconstructive transfer learning for remote sensing cloud removal, yielding more accurate, structurally faithful cloud-free reconstructions and pointing toward the integration of larger vision-language models in future work.
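To make the layer-wise learning rate decay mentioned above concrete, the following is a minimal sketch of one common formulation, in which layers closer to the input receive geometrically smaller learning rates so that pre-trained low-level features change less during fine-tuning. The function name `layerwise_learning_rates`, the base rate `1e-3`, and the decay factor `0.75` are illustrative assumptions, not values reported in the paper; 24 matches the number of transformer blocks in a standard ViT-Large encoder.

```python
def layerwise_learning_rates(base_lr, decay, num_layers):
    """Return one learning rate per layer (index 0 is closest to the input).

    The last layer keeps base_lr; each earlier layer is scaled by one
    additional factor of `decay`. This is an assumed, commonly used scheme,
    not necessarily the exact schedule used by the authors.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


# Hypothetical hyperparameters, chosen only for illustration.
rates = layerwise_learning_rates(base_lr=1e-3, decay=0.75, num_layers=24)

# Earlier layers get strictly smaller learning rates than later ones.
print(f"first layer lr: {rates[0]:.3e}, last layer lr: {rates[-1]:.3e}")
```

In a training framework, each rate would typically be attached to the parameter group of the corresponding transformer block when constructing the optimizer.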
Abstract
Cloud removal plays a crucial role in enhancing remote sensing image analysis, yet accurately reconstructing cloud-obscured regions remains a significant challenge. Recent advancements in generative models have made the generation of realistic images increasingly accessible, offering new opportunities for this task. Given the conceptual alignment between image generation and cloud removal tasks, generative models present a promising approach for addressing cloud removal in remote sensing. In this work, we propose a deep transfer learning approach built on a generative adversarial network (GAN) framework to explore the potential of the novel masked autoencoder (MAE) image reconstruction model in cloud removal. Due to the complexity of remote sensing imagery, we further propose using a patch-wise discriminator to determine whether each patch of the image is real or not. The proposed reconstructive transfer learning approach demonstrates significant improvements in cloud removal performance compared to other GAN-based methods. Additionally, whilst direct comparisons with some of the state-of-the-art cloud removal techniques are limited due to unclear details regarding their train/test data splits, the proposed model achieves competitive results based on available benchmarks.
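As a rough illustration of how a patch-wise discriminator can enter the training objective, the sketch below averages a real/fake loss over a grid of per-patch discriminator outputs and adds it to a mean squared error term. It assumes a binary cross-entropy adversarial loss on raw logits and a hypothetical weighting parameter `adv_weight`; the abstract does not specify the exact adversarial loss form or the weighting, so both are assumptions, as are all function names.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for a single raw logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def patch_gan_loss(patch_logits, is_real):
    """Average the real/fake loss over a 2D grid of per-patch logits."""
    target = 1.0 if is_real else 0.0
    flat = [logit for row in patch_logits for logit in row]
    return sum(bce_with_logits(logit, target) for logit in flat) / len(flat)


def mse_loss(pred, target):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def generator_loss(pred, target, fake_patch_logits, adv_weight=1.0):
    """Combined objective: reconstruction term plus weighted adversarial term.

    `adv_weight` is a hypothetical knob; the generator is rewarded when the
    discriminator labels its patches as real.
    """
    return mse_loss(pred, target) + adv_weight * patch_gan_loss(fake_patch_logits, True)


# Toy example: a 2x2 grid of patch logits and a four-pixel "image".
loss = generator_loss(
    pred=[0.2, 0.4, 0.6, 0.8],
    target=[0.2, 0.4, 0.6, 0.8],
    fake_patch_logits=[[0.0, 0.0], [0.0, 0.0]],
)
```

Because each patch contributes its own term, a generated image with one unconvincing region is penalized for that region even if the rest of the image looks realistic, which is the motivation the abstract gives for judging patches rather than whole images.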
