Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery

Jia Jia, Geunho Lee, Zhibo Wang, Lyu Zhi, Yuchu He

TL;DR

SMDNet targets improved boundary delineation and robustness in high-resolution remote sensing change detection by merging a Siamese U2Net-based Feature Differential Encoder with a Denoising Diffusion Implicit Model. The approach leverages multi-scale differential features, spatial attention, and DDIM-based denoising to produce edge-aware change maps that are resilient to lighting and atmospheric variations. Empirical results on LEVIR-CD, DSIFN-CD, and CDD show competitive or superior F1 scores and IoU, confirming the method's effectiveness in capturing subtle changes and complex boundaries. The work highlights diffusion-based denoising as a viable pathway to enhance CD in RS imagery and points to future directions in lightweight designs and semi/self-supervised learning to scale the approach further.
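The F1 and IoU figures mentioned above are standard pixel-wise metrics for binary change maps. As a minimal sketch of how they are usually computed, assuming flat 0/1 masks where 1 marks a changed pixel (the function name and the toy masks are illustrative and not taken from the paper):

```python
def change_detection_scores(pred, target):
    """Return (precision, recall, f1, iou) for flat binary masks (1 = changed)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Count true positives, false positives, and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, iou


# Toy example (hypothetical masks, not data from the paper).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
precision, recall, f1, iou = change_detection_scores(pred, target)
print(f"F1={f1:.4f} IoU={iou:.4f}")
```

Both metrics ignore true negatives, which matters in change detection because unchanged pixels usually dominate the image.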

Abstract

Deep learning has substantially advanced change detection (CD) in remote sensing images, and most recent CD methods rely on CNN and Transformer architectures. However, these architectures represent boundary details poorly and are prone to false alarms and missed detections under complex lighting and weather conditions. To address this, we propose the Siamese Meets Diffusion Network (SMDNet). The network combines the Siam-U2Net Feature Differential Encoder (SU-FDE) with a denoising diffusion implicit model to improve the accuracy of edge change detection and the model's robustness to environmental changes. First, the SU-FDE module uses shared-weight feature extraction to capture differences between time-series images and exploits similarities between features to enhance edge detail detection. Second, an attention mechanism identifies key coarse features, improving the model's sensitivity and accuracy. Finally, a diffusion model with progressive sampling fuses the key coarse features; its denoising ability and its capacity to capture the probability distribution of image data improve the model's adaptability to different environments. On the LEVIR-CD, DSIFN-CD, and CDD datasets, SMDNet achieves F1 scores of 90.99%, 88.40%, and 88.47%, respectively, indicating that it accurately identifies changes and intricate details.
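To make the role of the denoising diffusion implicit model more concrete, the following is a minimal sketch of one deterministic DDIM update (the eta = 0 case), applied element-wise to a flat list of values. It is an illustration under stated assumptions rather than the authors' implementation: the noise prediction `eps_pred` is passed in directly, whereas in SMDNet it would come from the Denoising UNet, and the names `add_noise`, `ddim_step`, `alpha_bar_t`, and `alpha_bar_prev` are hypothetical.

```python
import math


def add_noise(x0, eps, alpha_bar_t):
    """Forward diffusion: x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * eps."""
    return [math.sqrt(alpha_bar_t) * x + math.sqrt(1.0 - alpha_bar_t) * e
            for x, e in zip(x0, eps)]


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) from step t to the previous step."""
    out = []
    for x, e in zip(x_t, eps_pred):
        # Estimate the clean signal implied by the current noise prediction.
        x0_pred = (x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
        # Re-noise that estimate to the (lower) noise level of the previous step.
        out.append(math.sqrt(alpha_bar_prev) * x0_pred
                   + math.sqrt(1.0 - alpha_bar_prev) * e)
    return out


# Toy check (hypothetical values): with a perfect noise prediction, stepping
# to alpha_bar_prev = 1.0 (no noise left) recovers the clean signal.
x0 = [0.5, -1.0]
eps = [0.3, -0.2]
x_t = add_noise(x0, eps, alpha_bar_t=0.25)
recovered = ddim_step(x_t, eps, alpha_bar_t=0.25, alpha_bar_prev=1.0)
print(recovered)
```

Because the update contains no random term, sampling can skip timesteps, which is the usual reason to prefer DDIM over ancestral DDPM sampling when inference speed matters.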

Paper Structure (21 sections, 17 equations, 4 figures, 6 tables)

This paper contains 21 sections, 17 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: The structural layout of the SMDNet network design. T1, T2, and $GT_{0}$ are the pre-change, post-change, and labeled images in the CDD dataset. (a) is the proposed Siam-U2Net Feature Differential Encoder (SU-FDE) for feature extraction of bitemporal image pairs; (b) is the Denoising UNet for denoising; (c) is the Spatial Attention Mechanism.
  • Figure 2: Residual U Block (RSU). It adopts the symmetric encoder-decoder design of U-Net with a depth of D layers. In the example shown, the depth is 7 layers, giving the RSU 7 block.
  • Figure 3: Histogram comparison on the CDD dataset using different attention mechanisms in the SU-FDE of SMDNet.
  • Figure 4: Visualization of experimental results on three datasets LEVIR-CD, DSIFN-CD, and CDD.
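
The shared-weight idea behind the SU-FDE in Figure 1(a) can be sketched in a few lines: both images pass through the same encoder, and the resulting features are compared. This is a toy illustration only. The per-pixel linear projection with ReLU stands in for the Siam-U2Net encoder, the absolute difference is an assumed way of forming differential features, and the names `encode` and `differential_features` are hypothetical.

```python
def encode(pixels, weights):
    """Toy shared encoder: per-pixel linear projection over channels, then ReLU."""
    return [
        [max(0.0, sum(w * v for w, v in zip(row, px))) for row in weights]
        for px in pixels
    ]


def differential_features(t1, t2, weights):
    """Encode both images with the SAME weights and take the absolute difference."""
    f1 = encode(t1, weights)
    f2 = encode(t2, weights)
    return [[abs(a - b) for a, b in zip(p1, p2)] for p1, p2 in zip(f1, f2)]


# Toy bitemporal pair (hypothetical): two pixels, two channels each.
weights = [[1.0, 0.0], [0.5, 0.5]]   # shared by both branches
t1 = [[0.2, 0.4], [1.0, 0.0]]        # pre-change image
t2 = [[0.2, 0.4], [0.0, 1.0]]        # post-change image
diff = differential_features(t1, t2, weights)
print(diff)  # the unchanged first pixel yields zeros in every feature
```

Sharing the weights ensures that identical inputs map to identical features, so any non-zero difference reflects a change in the images rather than a mismatch between two separately trained encoders.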