Exploring Effective Priors and Efficient Models for Weakly-Supervised Change Detection

Zhenghui Zhao, Lixiang Ru, Chen Wu

TL;DR

This work tackles weakly-supervised change detection by addressing change missing and fabricating through global-scale and local-scale priors. It introduces a Dilated Prior (DP) decoder and a Label Gated (LG) constraint, forming TransWCD-DL when combined with a lightweight transformer-based WSCD model. The approach achieves substantial gains over state-of-the-art WSCD methods on WHU-CD and competes with fully supervised methods while maintaining efficiency. The study also analyzes CAM behavior in WSCD and discusses the necessity of shortcuts in transformer-based WSCD, offering practical baselines and insights for future work.

Abstract

Weakly-supervised change detection (WSCD) aims to detect pixel-level changes with only image-level annotations. Owing to its label efficiency, WSCD has drawn increasing attention recently. However, current WSCD methods often encounter the challenge of change missing and fabricating, i.e., inconsistency between image-level annotations and pixel-level predictions. Specifically, change missing refers to the situation in which the WSCD model fails to predict any changed pixels even though the image-level label indicates a change; change fabricating is the reverse, in which the model predicts changed pixels even though the image-level label indicates no change. To address this challenge, we leverage global-scale and local-scale priors in WSCD and propose two components: a Dilated Prior (DP) decoder and a Label Gated (LG) constraint. The DP decoder decodes samples with the changed image-level label, skips samples with the unchanged label, and replaces them with an all-unchanged pixel-level label. The LG constraint is derived from the correspondence between changed representations and image-level labels, penalizing the model when it mispredicts the change status. Additionally, we develop TransWCD, a simple yet powerful transformer-based model, showcasing the potential of weakly-supervised learning in change detection. By integrating the DP decoder and LG constraint into TransWCD, we form TransWCD-DL. Our proposed TransWCD and TransWCD-DL achieve significant F1 score improvements of +6.33% and +9.55%, respectively, over the state-of-the-art methods on the WHU-CD dataset. Some performance metrics even exceed those of several fully-supervised change detection (FSCD) competitors. Code will be available at https://github.com/zhenghuizhao/TransWCD.
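
The two failure modes defined in the abstract can be made concrete with a small check. The following is only an illustrative sketch, not the authors' code: the function name `consistency_error` and the label encoding (1 for changed, 0 for unchanged) are assumptions introduced here.

```python
import numpy as np

def consistency_error(image_label: int, pred_mask: np.ndarray) -> str:
    """Classify the image-level vs. pixel-level inconsistency of one sample.

    image_label: 1 for "changed", 0 for "unchanged" (assumed encoding).
    pred_mask:   binary array in which 1 marks a pixel predicted as changed.
    """
    has_changed_pixels = bool(pred_mask.any())
    if image_label == 1 and not has_changed_pixels:
        # The label says "changed", but no changed pixel was predicted.
        return "change missing"
    if image_label == 0 and has_changed_pixels:
        # The label says "unchanged", but changed pixels were predicted.
        return "change fabricating"
    return "consistent"

# Example: an "unchanged" image pair with one spuriously changed pixel.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1, 2] = 1
print(consistency_error(0, mask))
```

A check like this only flags samples whose prediction contradicts the image-level label; it says nothing about whether the predicted changed pixels are in the right place.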

Paper Structure (32 sections, 18 equations, 11 figures, 6 tables)

This paper contains 32 sections, 18 equations, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Difference between fully-supervised change detection (FSCD) and weakly-supervised change detection (WSCD). WSCD utilizes image-level binary labels "Changed (Chg)" and "Unchanged (Unchg)" as supervision signals instead of ground truths.
  • Figure 2: Problem of change missing and fabricating in WSCD. False positives and false negatives are highlighted in red and blue, respectively. The first row illustrates change fabricating in WCDNet [ander2020] and FCD-GAN [wu2023]: the image-level label indicates no change, but changed pixels are incorrectly predicted. The second row shows change missing: the image-level label indicates a change, but no changed pixels are detected. Our TransWCD-DL substantially reduces both change missing and fabricating. For a comparison with more existing methods, refer to Fig. \ref{experiment3}.
  • Figure 3: Pipeline of TransWCD. TransWCD is our basic model and comes in (a) a single-stream scheme and (b) a Siamese dual-stream scheme. Both schemes comprise 1) a hierarchical transformer encoder with image-level supervision, 2) a mono-layer minimalist difference module, and 3) a shortcut-free multi-scale CAM prediction module.
  • Figure 4: Framework of TransWCD-DL. TransWCD-DL consists of TransWCD, the Dilated Prior (DP) decoder, and the Label Gated (LG) constraint. The DP decoder decodes the upstream feature map when the corresponding image-level label is "Changed (Chg)"; otherwise, it uses an all-unchanged pixel-level label as the supervisory signal. The LG constraint is a label-adaptive feature-level regularizer that relies on the changed features within the output feature map. These two modules incorporate the global-scale and local-scale priors to mitigate change missing and fabricating.
  • Figure 5: Global-scale prior and local-scale prior. (a) The global-scale prior states that the "Unchanged (Unchg)" image-level label is equivalent to an all-zero pixel-level ground truth. (b) The local-scale prior states that if the image-level label is "Changed (Chg)," changed regions must be present; otherwise, changed regions must be absent.
  • ...and 6 more figures
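
The global-scale and local-scale priors described in the captions for Figures 4 and 5 can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's DP decoder or LG constraint: the helper names `dp_pixel_target` and `label_gated_penalty`, the label encoding (1 for changed, 0 for unchanged), and the max-based log penalty are all hypothetical choices made here for clarity.

```python
import numpy as np

def dp_pixel_target(image_label: int, decoded_mask: np.ndarray) -> np.ndarray:
    """Global-scale prior: an "unchanged" image label implies an all-zero mask.

    For a "changed" sample the decoded mask is kept as the pixel-level target;
    for an "unchanged" sample decoding is skipped and an all-unchanged target
    is used instead.
    """
    if image_label == 0:
        return np.zeros_like(decoded_mask)
    return decoded_mask

def label_gated_penalty(image_label: int, change_prob: np.ndarray,
                        eps: float = 1e-7) -> float:
    """Hypothetical local-scale penalty gated by the image-level label.

    change_prob holds per-pixel probabilities of "changed" in [0, 1].
    The max-based form is an assumption for illustration only.
    """
    p_max = float(np.clip(change_prob, eps, 1.0 - eps).max())
    if image_label == 1:
        # Changed: at least one pixel should look changed.
        return float(-np.log(p_max))
    # Unchanged: no pixel should look changed.
    return float(-np.log(1.0 - p_max))

# Example: a "changed" sample whose prediction contains no changed region
# receives a large penalty, while an "unchanged" sample with the same
# prediction receives almost none.
flat = np.zeros((4, 4))
print(label_gated_penalty(1, flat), label_gated_penalty(0, flat))
```

Gating on the image-level label means the same prediction is penalized differently depending on whether a change is expected, which is the behavior the local-scale prior calls for.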