SAR to Optical Image Translation with Color Supervised Diffusion Model

Xinyu Bai, Feng Xu

TL;DR

The paper tackles the interpretability gap between SAR and optical imagery by translating SAR to optical images using a diffusion-based conditional model. A color-supervision mechanism and a 1×1 convolution-driven color path are integrated to address color shifts while guiding generation with SAR inputs. Evaluations on the SEN12 dataset show the proposed method outperforms GAN-based and diffusion baselines in PSNR, SSIM, and FID, while producing clearer boundaries and more faithful colors. Although inference is slower due to iterative sampling, the approach significantly enhances cross-modal interpretability for remote sensing applications and offers a robust alternative to prior methods.

Abstract

Synthetic Aperture Radar (SAR) offers all-weather, high-resolution imaging capabilities, but its complex imaging mechanism often makes the images hard to interpret. To address this, the paper introduces a generative model that translates SAR images into more intelligible optical images, thereby enhancing the interpretability of SAR data. Specifically, the model backbone builds on recent diffusion models, which have strong generative capabilities. SAR images serve as the conditional guide in the sampling process, and color supervision is integrated to counteract color shift. Experiments were conducted on the SEN12 dataset and evaluated quantitatively using peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and Fréchet inception distance (FID). The results show that the model not only surpasses previous methods on these metrics but also improves the visual quality of the generated images.
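
As an illustration of the first metric named in the abstract, the standard PSNR definition, 10 · log10(MAX² / MSE), can be sketched as below. This is a minimal sketch rather than the authors' evaluation code: the function name `psnr`, the flat-list image representation, and the default `max_value` of 255 (8-bit pixels) are assumptions made for the example.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    Both images are given as flat sequences of pixel values of equal
    length (a hypothetical representation chosen for this sketch).
    """
    if len(reference) != len(generated):
        raise ValueError("images must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    real_optical = [100, 100, 100, 100]
    translated = [110, 90, 110, 90]
    print(f"PSNR: {psnr(real_optical, translated):.2f} dB")
```

Higher PSNR means the generated image is closer to the reference pixel by pixel; it does not capture perceptual or distributional quality, which is why SSIM and FID are reported alongside it.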

Paper Structure

This paper contains 12 sections, 7 equations, 3 figures, and 1 table.

Figures (3)

  • Figure 1: Overview of the diffusion process.
  • Figure 2: Optical image generation conditioned on SAR images. The channel dimension of the SAR image $c_{s}$ is increased by a $1 \times 1$ convolution.
  • Figure 3: Translation results across various models. Columns show: (a) the SAR image; (b) CycleGAN; (c) NiceGAN; (d) CRAN; (e) S2ODPM; (f) the proposed model; (g) the real optical image (ground truth).
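
The $1 \times 1$ convolution mentioned in the Figure 2 caption can be read as a per-pixel linear map across channels: each output channel is a weighted sum of the input channels at the same spatial location, plus a bias. The sketch below shows that operation in plain Python. The function name `conv1x1`, the nested-list layout, the choice of three output channels, and the weight and bias values are all hypothetical; the summary does not state the actual channel counts or parameters used in the paper.

```python
def conv1x1(image, weights, bias):
    """Apply a 1x1 convolution.

    image:   nested list shaped [C_in][H][W]
    weights: nested list shaped [C_out][C_in]
    bias:    list of length C_out
    Returns a nested list shaped [C_out][H][W].
    """
    c_in = len(image)
    height, width = len(image[0]), len(image[0][0])
    output = []
    for channel_weights, b in zip(weights, bias):
        # Every pixel is mapped independently: no spatial neighbours are used.
        channel = [
            [
                b + sum(channel_weights[c] * image[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        output.append(channel)
    return output


if __name__ == "__main__":
    # Single-channel 2x2 "SAR" patch (illustrative values only).
    sar = [[[1.0, 2.0], [3.0, 4.0]]]
    # Hypothetical parameters expanding 1 channel to 3 channels.
    weights = [[1.0], [0.5], [2.0]]
    bias = [0.0, 1.0, -1.0]
    expanded = conv1x1(sar, weights, bias)
    print(len(expanded), "channels of size", len(expanded[0]), "x", len(expanded[0][0]))
```

Because the kernel covers a single pixel, the spatial size is unchanged and only the channel count grows, which is consistent with the caption's description of increasing the dimension of $c_{s}$. In practice this would be a learned layer in a deep learning framework rather than fixed weights.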