Conditional Brownian Bridge Diffusion Model for VHR SAR to Optical Image Translation

Seon-Hoon Kim, Dae-Won Chung

TL;DR

The paper tackles the translation of 0.5 m very-high-resolution (VHR) SAR imagery into optical-like representations to aid interpretation. It introduces a conditional Brownian Bridge Diffusion Model (cBBDM) that couples Brownian bridge diffusion with pixel-space conditioning and operates in a latent space via a VQ-VAE backbone. Evaluated on the MSAW dataset of paired 0.5 m SAR and WorldView-2 optical images, the proposed method outperforms GAN-based approaches and conditional Latent Diffusion Models across perceptual, spectral, and structural metrics, while preserving spatial detail and remaining robust to speckle noise. The result is higher-fidelity SAR-to-optical translation with practical potential for SAR interpretation and downstream remote-sensing tasks.

Abstract

Synthetic Aperture Radar (SAR) imaging offers the unique advantage of collecting data regardless of weather conditions and time of day. However, SAR images exhibit complex backscatter patterns and speckle noise, so interpreting them requires expertise. To aid the interpretation of SAR data, prior research has translated SAR images into optical-like representations. Nevertheless, existing studies have predominantly used low-resolution satellite imagery datasets and have largely been based on Generative Adversarial Networks (GANs), which are known for their training instability and low fidelity. To overcome the limitations of low-resolution data and GAN-based approaches, this letter introduces a conditional image-to-image translation approach based on the Brownian Bridge Diffusion Model (BBDM). We conducted comprehensive experiments on the MSAW dataset, a collection of paired SAR and optical images at 0.5 m Very-High-Resolution (VHR). The experimental results indicate that our method surpasses both Conditional Diffusion Models (CDMs) and GAN-based models on diverse perceptual quality metrics.

Paper Structure

This paper contains 13 sections, 13 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Directed graphical models of the diffusion-based methods. (a) Conditional LDM. (b) BBDM. (c) Conditional BBDM. $X_0$ denotes the latent features of the optical imagery and $Y$ denotes the latent features of the SAR imagery. The BBDM framework translates $X_T$ directly into $X_0$ through a Brownian bridge. In contrast, the conditional LDM framework gradually reconstructs $X_0$ from a noisy $X_T$, guided by the condition derived from the SAR imagery. Conditional BBDM uses the condition to guide the direct mapping from $X_T$ to $X_0$. In both (b) and (c), the Brownian bridge process is depicted in the background of the diffusion process.
  • Figure 2: Results of VHR SAR to optical image translation using different methods. The first row shows SAR to optical image translation for an urban scene. The second row shows translation for trees and bare land.
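
To make the Brownian bridge in Figure 1 (b) and (c) concrete, the following is a minimal sketch of how an intermediate latent could be sampled between the two endpoints. It assumes the standard BBDM formulation, which this excerpt does not spell out: the bridge is pinned at $X_0$ (optical latent) at $t = 0$ and at $Y$ (SAR latent) at $t = T$, with $m_t = t/T$ and variance $\delta_t = 2s(m_t - m_t^2)$. The function names, the variance scale `s`, and the use of plain Python lists as latents are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def bridge_schedule(t, T, s=1.0):
    """Return (m_t, delta_t) for step t of a T-step Brownian bridge.

    Assumed schedule (standard BBDM form, not taken from this excerpt):
    m_t = t / T and delta_t = 2 * s * (m_t - m_t**2).
    """
    m_t = t / T
    delta_t = 2.0 * s * (m_t - m_t * m_t)
    return m_t, delta_t


def sample_bridge_state(x0, y, t, T, s=1.0, rng=random):
    """Sample x_t = (1 - m_t) * x0 + m_t * y + sqrt(delta_t) * eps.

    x0 and y are equal-length lists of floats standing in for the
    optical and SAR latents; eps is standard Gaussian noise.
    """
    m_t, delta_t = bridge_schedule(t, T, s)
    std = math.sqrt(max(delta_t, 0.0))  # guard against tiny negative round-off
    return [
        (1.0 - m_t) * a + m_t * b + std * rng.gauss(0.0, 1.0)
        for a, b in zip(x0, y)
    ]
```

Under these assumptions the noise vanishes at both endpoints, so the sampled state equals $X_0$ at $t = 0$ and $Y$ at $t = T$, and the variance is largest at the midpoint. This is what distinguishes the bridge from the conditional LDM in (a), where $X_T$ is noise rather than the SAR latent.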