DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Change Detection

Wele Gedara Chaminda Bandara, Nithin Gopalakrishnan Nair, Vishal M. Patel

TL;DR

The paper addresses remote sensing change detection under limited labeled data by leveraging a denoising diffusion probabilistic model as a pre-trained feature extractor. It pre-trains a DDPM on unlabeled RS imagery, extracts multi-scale, multi-timestep features from the decoder, and fine-tunes a lightweight change detector on these representations. The approach achieves state-of-the-art performance on four public datasets, significantly outperforming baselines across F1, IoU, and OA, and demonstrates robustness to environmental perturbations. Public code is provided to facilitate adoption and replication of diffusion-based RS change detection.
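
The data flow summarized above (features from several timesteps and several spatial scales, compared between the two images) can be sketched in a few lines. This is a shape-level illustration only, not the authors' implementation: `placeholder_features` is a hypothetical stand-in for the frozen DDPM decoder (a fixed random projection of an average-pooled image), the absolute difference and channel-wise stacking are assumptions, and the timestep values are illustrative.

```python
import numpy as np

def placeholder_features(image, t, scales=(0, 1, 2, 3, 4), channels=8):
    """Hypothetical stand-in for a frozen, pre-trained DDPM decoder.

    Returns {i: array of shape (channels, h / 2**i, w / 2**i)}.
    """
    c, h, w = image.shape
    rng = np.random.default_rng(t)            # one fixed projection per timestep
    proj = rng.standard_normal((channels, c))
    feats = {}
    for i in scales:
        k = 2 ** i
        # Average-pool by a factor of 2**i to mimic coarser decoder levels.
        pooled = image.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))
        feats[i] = np.einsum("oc,chw->ohw", proj, pooled)
    return feats

def difference_features(img_a, img_b, timesteps):
    """Per scale, stack |F_A - F_B| over all timesteps along the channel axis."""
    per_t = [(placeholder_features(img_a, t), placeholder_features(img_b, t))
             for t in timesteps]
    scales = list(per_t[0][0].keys())
    return {i: np.concatenate([np.abs(fa[i] - fb[i]) for fa, fb in per_t], axis=0)
            for i in scales}

rng = np.random.default_rng(0)
img_a = rng.uniform(-1.0, 1.0, size=(3, 64, 64))      # pre-change image
img_b = img_a.copy()                                   # post-change image
img_b[:, 16:32, 16:32] = rng.uniform(-1.0, 1.0, size=(3, 16, 16))  # simulated change

timesteps = (50, 100, 400)                             # illustrative values only
diff = difference_features(img_a, img_b, timesteps)
for i, d in diff.items():
    print(i, d.shape)
```

In the described method, difference representations of this kind would be consumed by a lightweight change detector trained with change labels; that part is omitted here.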

Abstract

Remote sensing change detection is crucial for understanding the dynamics of our planet's surface, facilitating the monitoring of environmental changes, evaluating human impact, predicting future trends, and supporting decision-making. In this work, we introduce a novel approach for change detection that can leverage off-the-shelf, unlabeled remote sensing images in the training process by pre-training a Denoising Diffusion Probabilistic Model (DDPM) - a class of generative models used in image synthesis. DDPMs learn the training data distribution by gradually converting training images into a Gaussian distribution using a Markov chain. During inference (i.e., sampling), they can generate a diverse set of samples closer to the training distribution, starting from Gaussian noise, achieving state-of-the-art image synthesis results. However, in this work, our focus is not on image synthesis but on utilizing it as a pre-trained feature extractor for the downstream application of change detection. Specifically, we fine-tune a lightweight change classifier utilizing the feature representations produced by the pre-trained DDPM alongside change labels. Experiments conducted on the LEVIR-CD, WHU-CD, DSIFN-CD, and CDD datasets demonstrate that the proposed DDPM-CD method significantly outperforms the existing state-of-the-art change detection methods in terms of F1 score, IoU, and overall accuracy, highlighting the pivotal role of pre-trained DDPM as a feature extractor for downstream applications. We have made both the code and pre-trained models available at https://github.com/wgcban/ddpm-cd
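
The "gradual conversion into a Gaussian distribution" mentioned above has a well-known closed form in DDPMs: $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. The sketch below illustrates it with NumPy. The linear schedule and its endpoints ($10^{-4}$ to $0.02$ over $T=1000$ steps) follow the common DDPM defaults and are an assumption here, not a setting confirmed by this page; the function names are hypothetical.

```python
import numpy as np

def linear_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances beta_1..beta_T (assumed linear schedule)."""
    return np.linspace(beta_start, beta_end, T)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0) in one shot using the closed form."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

betas = linear_beta_schedule()
alpha_bar = np.cumprod(1.0 - betas)       # cumulative product of (1 - beta_s)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))   # toy "image" scaled to [-1, 1]

x_early = q_sample(x0, 50, alpha_bar, rng)      # mostly signal
x_late = q_sample(x0, 999, alpha_bar, rng)      # almost pure Gaussian noise

corr_early = np.corrcoef(x0.ravel(), x_early.ravel())[0, 1]
corr_late = np.corrcoef(x0.ravel(), x_late.ravel())[0, 1]
print(f"corr(x0, x_50)  = {corr_early:.3f}")
print(f"corr(x0, x_999) = {corr_late:.3f}")
```

At a small timestep the noisy sample stays strongly correlated with the input, while at the final timestep almost none of the original signal remains.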

Paper Structure (20 sections, 18 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Images sampled from the DDPM model pre-trained on off-the-shelf remote sensing images. The generated images exhibit common objects typically observed in real remote sensing imagery, including buildings, trees, roads, vegetation, water surfaces, etc. This showcases the remarkable capability of diffusion models to grasp essential semantics from the training dataset. Although our primary focus isn't image synthesis, we explore the effectiveness of DDPM as a feature extractor for change detection.
  • Figure 2: Diffusion model as a directed graphical model (Ho et al., 2020).
  • Figure 3: Block diagrams illustrating the proposed DDPM-CD approach. The proposed DDPM-CD involves three main steps: (I) Extraction of multi-scale $(i \in \{4,3,2,1,0\})$ and multi-timestep features $(t \in \{t_0, \cdots, t_n\})$ from pre-change $(I_A)$ and post-change $(I_B)$ images, denoted as $\{F_A^{t, i}\}$ and $\{F_B^{t, i}\}$, respectively. (II) Computation of difference feature representations at each hierarchical scale (say, $i+1$) by the change decoder block $(f_d^{i+1}(\cdot))$, which takes multi-timestep features of pre-change and post-change images at the $(i+1)$-th scale, denoted as $\{F_A^{t,i+1}\}_{t=t_0}^{t=t_n}$ and $\{F_B^{t,i+1}\}_{t=t_0}^{t=t_n}$, along with the previous scale's difference feature representations $\widetilde{F}_D^{i}$ as inputs, and outputs the difference feature representations for the current scale, $\widetilde{F}_D^{i+1}$. (III) Cascading change decoder blocks across all spatial scales ($i=4$ to $i=0$) to form the hierarchical change decoder, represented as $f_d(\cdot)=f_d^4(f_d^3(\cdots f_d^0()))$. Finally, the output from the hierarchical change decoder $\widetilde{F}_D^0$ is fed into the change classifier $f_{cls}(\cdot)$, which predicts the change probability map $P_{cd}$.
  • Figure 4: Visualization of multi-scale, multi-timestep feature representations for a given input image shown in the top left corner, extracted from the pre-trained DDPM's decoder. These multi-scale, multi-timestep feature representations are used to fine-tune a hierarchical change decoder followed by a change classifier with change labels. Here, $i$ denotes the hierarchical layer of feature maps, and $(h/2^i,w/2^i)$ denotes the (height, width) of feature representations. Additionally, $t\in[0, T]$ represents the timestep which defines the variance of noise added to the input image prior to feeding it to DDPM.
  • Figure 5: Comparison of different state-of-the-art change detection methods on the LEVIR-CD dataset: (a) Pre-change image, (b) Post-change image, (c) FC-EF, (d) FC-Siam-diff, (e) FC-Siam-conc, (f) DT-SCN, (g) BIT, (h) ChangeFormer, (i) DDPM-CD (ours), and (j) Ground-truth. Note that true positives (change class) are indicated in white, true negatives (no-change class) are indicated in black, and false positives and false negatives are indicated in red.
  • ...and 3 more figures
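
The evaluation quantities named on this page (F1, IoU, and overall accuracy) and the colour coding in the Figure 5 caption both come from the same pixel-wise confusion counts. The sketch below shows the standard definitions for binary change maps. The function names and the toy 4×4 maps are hypothetical and are not taken from the paper's code or data.

```python
import numpy as np

def change_metrics(pred, gt):
    """F1, IoU (change class), and overall accuracy for binary change maps."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = int(np.sum(pred & gt))      # changed, predicted changed
    tn = int(np.sum(~pred & ~gt))    # unchanged, predicted unchanged
    fp = int(np.sum(pred & ~gt))     # unchanged, predicted changed
    fn = int(np.sum(~pred & gt))     # changed, predicted unchanged
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    oa = (tp + tn) / pred.size
    return {"f1": f1, "iou": iou, "oa": oa}

def error_map(pred, gt):
    """RGB visualisation: white = TP, black = TN, red = FP or FN."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    rgb = np.zeros(pred.shape + (3,), dtype=np.uint8)   # black background (TN)
    rgb[pred & gt] = (255, 255, 255)
    rgb[pred ^ gt] = (255, 0, 0)
    return rgb

# Toy example: 4 changed pixels in the ground truth; the prediction
# recovers 3 of them and adds 1 false alarm.
gt = np.zeros((4, 4), dtype=bool)
gt[:2, :2] = True
pred = gt.copy()
pred[1, 1] = False   # one false negative
pred[3, 3] = True    # one false positive

metrics = change_metrics(pred, gt)
vis = error_map(pred, gt)
print(metrics)
```

With these toy maps the counts are TP = 3, FP = 1, FN = 1, and TN = 11, so F1 = 0.75, IoU = 0.6, and OA = 0.875.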