LDGNet: A Lightweight Difference Guiding Network for Remote Sensing Change Detection

Chenfeng Xu

TL;DR

LDGNet tackles efficient optical remote-sensing change detection by introducing absolute-difference-guided encoding and decoding. It employs a Difference Guiding Module (DGM) to progressively bias a lightweight encoder with multi-scale difference features, and a Difference-Aware Dynamic Fusion (DADF) module built on a Visual State Space Model (VSSM) for targeted, low-cost fusion in decoding. With only 3.43M parameters and 1.12G FLOPs, the approach delivers competitive or superior performance across four public datasets while suppressing noise and background interference. By balancing accuracy and computational cost, it targets practical edge deployment for land-cover change analysis.

Abstract

With the rapid advancement of deep learning, change detection (CD) in remote sensing imagery has achieved remarkable progress. Existing change detection methods primarily pursue higher accuracy at the cost of more computation and larger parameter counts, leaving the development of lightweight methods for rapid real-world processing an underexplored challenge. To address this challenge, we propose a Lightweight Difference Guiding Network (LDGNet), which uses the absolute difference image to guide optical remote sensing change detection. First, to enhance the feature representation capability of the lightweight backbone network, we propose the Difference Guiding Module (DGM), which leverages multi-scale features extracted from the absolute difference image to progressively influence the original image encoder at each layer, thereby reinforcing feature extraction. Second, we propose the Difference-Aware Dynamic Fusion (DADF) module, which uses a Visual State Space Model (VSSM) for lightweight long-range dependency modeling. The module first uses absolute feature differences to guide the VSSM's global contextual modeling of change regions, then employs difference attention to dynamically fuse these long-range features with the feature differences, enhancing change semantics while suppressing noise and background. Extensive experiments on multiple datasets demonstrate that our method achieves performance comparable or superior to current state-of-the-art (SOTA) methods that require several times more computation, while using only 3.43M parameters and 1.12G FLOPs.
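
The absolute difference image that the abstract uses as a guiding signal can be illustrated with a minimal NumPy sketch. It also shows one simple way a difference signal could re-weight features. This is not the authors' DGM or DADF: the function names, the sigmoid gate, and the residual form `features * (1 + gate)` are assumptions made purely for illustration.

```python
import numpy as np


def absolute_difference_image(pre, post):
    """Per-pixel |post - pre| for two co-registered images of equal shape."""
    # Cast to float first: subtracting uint8 arrays directly would wrap around.
    pre = np.asarray(pre, dtype=np.float32)
    post = np.asarray(post, dtype=np.float32)
    if pre.shape != post.shape:
        raise ValueError("bi-temporal images must have the same shape")
    return np.abs(post - pre)


def difference_gate(features, diff):
    """Hypothetical gating (an assumption, not the paper's module).

    Features are scaled by a sigmoid of the mean-centred difference signal,
    with a residual connection so unchanged regions are not zeroed out.
    """
    gate = 1.0 / (1.0 + np.exp(-(diff - diff.mean())))
    return features * (1.0 + gate)


# Toy bi-temporal pair: a 2x2 patch appears in the post-event image.
pre = np.zeros((4, 4, 3), dtype=np.uint8)
post = pre.copy()
post[1:3, 1:3] = 200

diff = absolute_difference_image(pre, post)
guided = difference_gate(post.astype(np.float32), diff)
```

In this toy case the difference image is nonzero only inside the changed patch, so the gate amplifies features there and leaves the background close to its original scale. A learned module would replace the fixed sigmoid with trainable layers.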

Paper Structure

This paper contains 12 sections, 8 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Illustration of our method. (a) shows the overall architecture of our network. The encoder consists of an independent difference feature extractor and a Siamese raw image encoder with shared weights. At each layer, the features extracted independently by the difference extractor are applied to the raw image encoder via the DGM. The decoder is composed of four DADF modules. The pre-event and post-event features are weighted with feature differences and fused through the VSS Block, then dynamically combined with feature differences through DFM. After upsampling and summation with the features from the previous layer, the result is fed into the next DADF module. (b) shows the detailed structure of the two components of the DGM: DA and SCA. (c) shows the structure of the DADF and the detailed construction of the VSS Block.
  • Figure 2: Qualitative results comparison across four datasets. White represents True Positives (TP), red represents False Positives (FP), green represents False Negatives (FN), and black represents True Negatives (TN). The leftmost three columns show bi-temporal images and ground truth. All method names are labeled in the bottom row. Cases (a)-(b) are from SYSU-CD, (c)-(d) from LEVIR-CD, (e)-(f) from WHU-CD, and (g)-(h) from DSIFN-CD datasets.
  • Figure 3: Comparison of GPU memory consumption across different methods as a function of input image size.
  • Figure 4: Heatmap comparison. Response intensity in shadow areas (marked by red boxes in T2) is weakened.
  • Figure 5: Interference intensity and inference results. X-axis represents the standard deviation of the Gaussian function. Gaussian blur kernel size is fixed at 3×3.
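
The colour coding described for Figure 2 (white TP, red FP, green FN, black TN) can be reproduced from a binary prediction and ground-truth mask. The sketch below is a minimal NumPy illustration; the function name `error_map` and the exact RGB values are assumptions, not taken from the paper.

```python
import numpy as np

# RGB colours following the Figure 2 legend (exact values are an assumption).
COLORS = {
    "TP": (255, 255, 255),  # white: change correctly detected
    "FP": (255, 0, 0),      # red: change predicted where none exists
    "FN": (0, 255, 0),      # green: real change that was missed
    "TN": (0, 0, 0),        # black: no change, correctly ignored
}


def error_map(pred, gt):
    """Return an H x W x 3 uint8 image colour-coding TP/FP/FN/TN pixels."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    if pred.shape != gt.shape:
        raise ValueError("prediction and ground truth must have the same shape")
    out = np.zeros(pred.shape + (3,), dtype=np.uint8)  # starts as all TN (black)
    out[pred & gt] = COLORS["TP"]
    out[pred & ~gt] = COLORS["FP"]
    out[~pred & gt] = COLORS["FN"]
    return out


# Tiny example containing one pixel of each outcome.
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [1, 0]])
vis = error_map(pred, gt)
```

Counting the pixels in each mask (for example `(pred.astype(bool) & gt.astype(bool)).sum()` for TP) gives the quantities from which precision, recall, and F1 are usually computed.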