CDXLSTM: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory

Zhenkai Wu; Xiaowen Ma; Rongrong Lian; Kai Zheng; Wei Zhang

TL;DR

The paper addresses remote sensing change detection (RS-CD), where existing CNN-, Transformer-, and Mamba-based methods struggle to balance accuracy and efficiency. It introduces CDXLSTM, an XLSTM-based framework built on a scale-specific Feature Enhancer: a Cross-Temporal Global Perceptron (CTGP) captures global context in low-resolution features, and a Cross-Temporal Spatial Refiner (CTSR) performs spatial refinement in high-resolution features. A Cross-Scale Interactive Fusion (CSIF) module then progressively combines global semantics with detailed spatial information. The architecture uses a Siamese backbone with Bi-mLSTM-based long-term modeling and axial Bi-mLSTM attention within CTSR, giving linear computational complexity and improved interpretability. On LEVIR-CD, WHU-CD, and CLCD, CDXLSTM achieves state-of-the-art F1 scores with only 16.19M parameters and 3.92G FLOPs, outperforming recent methods at lower compute cost. Training is supervised with a weighted sum of binary cross-entropy (BCE) and Dice losses, $\mathcal{L} = \lambda_{ce}\mathcal{L}_{ce} + \lambda_{dice}\mathcal{L}_{dice}$.
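
To make the combined objective $\mathcal{L} = \lambda_{ce}\mathcal{L}_{ce} + \lambda_{dice}\mathcal{L}_{dice}$ concrete, the following is a minimal, framework-free sketch of a BCE plus soft Dice loss over a flattened binary change map. It is not taken from the authors' code. The function names, the default weights `lambda_ce = lambda_dice = 1.0`, the clamping constant `eps`, and the Dice smoothing term `smooth` are all illustrative assumptions, since the summary does not state these values.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels.

    probs:   predicted change probabilities in [0, 1]
    targets: ground-truth labels (0 = unchanged, 1 = changed)
    eps:     clamping constant (an assumption) to avoid log(0)
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(P) + sum(T) + smooth).

    The smoothing term is an assumption; it keeps the ratio defined
    when both the prediction and the target are empty.
    """
    overlap = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)


def combined_loss(probs, targets, lambda_ce=1.0, lambda_dice=1.0):
    """Weighted sum of the two terms; the weights here are placeholders."""
    return (lambda_ce * bce_loss(probs, targets)
            + lambda_dice * dice_loss(probs, targets))


if __name__ == "__main__":
    target = [1, 0, 1, 0]
    good = [0.9, 0.1, 0.8, 0.2]
    bad = [0.2, 0.8, 0.1, 0.9]
    print("good prediction loss:", combined_loss(good, target))
    print("bad prediction loss: ", combined_loss(bad, target))
```

The BCE term penalizes each pixel independently, while the Dice term scores the overlap between predicted and true change regions as a whole. Pairing them is a common choice when changed pixels are a small fraction of the image, which is typical in change detection.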

Abstract

In complex scenes and varied conditions, effectively integrating spatial-temporal context is crucial for accurately identifying changes. However, current RS-CD methods lack a balanced consideration of performance and efficiency. CNNs lack global context, Transformers are computationally expensive, and Mambas face CUDA dependence and local correlation loss. In this paper, we propose CDXLSTM, with a core component that is a powerful XLSTM-based feature enhancement layer, integrating the advantages of linear computational complexity, global context perception, and strong interpretability. Specifically, we introduce a scale-specific Feature Enhancer layer, incorporating a Cross-Temporal Global Perceptron customized for semantic-accurate deep features, and a Cross-Temporal Spatial Refiner customized for detail-rich shallow features. Additionally, we propose a Cross-Scale Interactive Fusion module to progressively interact global change representations with spatial responses. Extensive experimental results demonstrate that CDXLSTM achieves state-of-the-art performance across three benchmark datasets, offering a compelling balance between efficiency and accuracy. Code is available at https://github.com/xwmaxwma/rschange.
