ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection
Mubashir Noman, Mustansar Fiaz, Hisham Cholakkal, Salman Khan, Fahad Shahbaz Khan
TL;DR
ELGC-Net addresses semantic change detection in high-resolution remote sensing imagery by integrating local spatial details with global contextual cues. The core contribution is the Efficient Local-Global Context Aggregator (ELGCA), which uses pooled-transpose (PT) attention for global context and depthwise convolution for local context, arranged in a parallel, channel-split design to reduce parameters. The architecture consists of a Siamese encoder, fusion modules, and a decoder, and does not rely on pre-trained backbones. A lighter variant, ELGC-Net-LW, achieves comparable accuracy with far fewer parameters and FLOPs. Evaluations on LEVIR-CD, DSIFN-CD, and CDD-CD show state-of-the-art performance across diverse CD tasks, with improvements in IoU, F1, and OA, including a 1.4% IoU gain over ChangeFormer on LEVIR-CD. The result is a practical, efficient CD framework suited to high-resolution imagery and resource-constrained environments, with potential for real-time deployment on edge devices.
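The parallel, channel-split idea described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the two-way split, the 1D token layout, the identity query/key/value projections, the average-pooling window, and all function names (`pooled_transpose_attention`, `depthwise_conv1d`, `local_global_aggregate`) are assumptions made for illustration. The point it shows is that the attention map is computed across channels (C x C) from pooled tokens, while the local branch filters each channel independently.

```python
import math

def avg_pool_tokens(x, k):
    """Average-pool each channel's tokens in non-overlapping windows of size k."""
    return [[sum(row[i:i + k]) / k for i in range(0, len(row) - k + 1, k)]
            for row in x]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def pooled_transpose_attention(q, k, v, pool=2):
    """q, k, v are C x N lists (channels x flattened spatial tokens).

    Assumption: queries and keys are average-pooled along the token axis,
    then a C x C (channel-to-channel) attention map is applied to v.
    """
    qp, kp = avg_pool_tokens(q, pool), avg_pool_tokens(k, pool)
    scale = 1.0 / math.sqrt(len(qp[0]))
    attn = [softmax([scale * sum(a * b for a, b in zip(qi, kj)) for kj in kp])
            for qi in qp]
    channels, tokens = len(v), len(v[0])
    return [[sum(attn[i][j] * v[j][n] for j in range(channels))
             for n in range(tokens)] for i in range(channels)]

def depthwise_conv1d(x, kernels):
    """One 3-tap kernel per channel, zero padding, no mixing between channels."""
    out = []
    for row, (a, b, c) in zip(x, kernels):
        p = [0.0] + row + [0.0]
        out.append([a * p[i] + b * p[i + 1] + c * p[i + 2]
                    for i in range(len(row))])
    return out

def local_global_aggregate(x, kernels, pool=2):
    """Hypothetical two-way split: first half local, second half global."""
    half = len(x) // 2
    local = depthwise_conv1d(x[:half], kernels)
    g = x[half:]  # identity projections stand in for learned q, k, v
    return local + pooled_transpose_attention(g, g, g, pool)

features = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0],
            [1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0]]
identity_kernels = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
mixed = local_global_aggregate(features, identity_kernels)
print(len(mixed), len(mixed[0]))  # output keeps the C x N shape of the input
```

Because each branch only sees part of the channels, neither the convolution nor the attention has to process the full feature width, which is one plausible reading of how a channel-split design keeps the parameter count down.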
Abstract
Deep learning has shown remarkable success in remote sensing change detection (CD), which aims to identify semantic change regions between co-registered satellite image pairs acquired at distinct time stamps. However, existing convolutional neural network and transformer-based frameworks often struggle to accurately segment semantic change regions. Moreover, transformer-based methods with standard self-attention suffer from quadratic computational complexity with respect to the image resolution, making them less practical for CD tasks with limited training data. To address these issues, we propose an efficient change detection framework, ELGC-Net, which leverages rich contextual information to precisely estimate change regions while reducing the model size. Our ELGC-Net comprises a Siamese encoder, fusion modules, and a decoder. The focus of our design is the introduction of an Efficient Local-Global Context Aggregator module within the encoder, capturing enhanced global context and local spatial information through a novel pooled-transpose (PT) attention and depthwise convolution, respectively. The PT attention employs pooling operations for robust feature extraction and minimizes computational cost with transposed attention. Extensive experiments on three challenging CD datasets demonstrate that ELGC-Net outperforms existing methods. Compared to the recent transformer-based CD approach (ChangeFormer), ELGC-Net achieves a 1.4% gain in the intersection over union (IoU) metric on the LEVIR-CD dataset, while significantly reducing trainable parameters. Our proposed ELGC-Net sets a new state of the art on remote sensing change detection benchmarks. Finally, we also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings, while achieving comparable performance. Project URL: https://github.com/techmn/elgcnet
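To make the cost argument concrete, the following back-of-the-envelope sketch compares multiply-accumulate counts for standard self-attention (an N x N map over N spatial tokens) with transposed attention (a C x C map over C channels). It counts only the two matrix products and ignores projections, softmax, and pooling. The channel width of 64 and the feature-map sizes are hypothetical values chosen for illustration, not figures from the paper.

```python
def self_attention_macs(num_tokens, channels):
    # Q @ K^T builds an N x N map from C-length dot products;
    # applying that map to V costs the same again.
    return 2 * num_tokens * num_tokens * channels

def transposed_attention_macs(num_tokens, channels):
    # Q @ K^T over the channel axis builds a C x C map from N-length
    # dot products; applying it to V costs the same again.
    return 2 * channels * channels * num_tokens

channels = 64  # hypothetical feature width
for side in (64, 128, 256):  # hypothetical feature-map side lengths
    n = side * side
    standard = self_attention_macs(n, channels)
    transposed = transposed_attention_macs(n, channels)
    print(f"{side}x{side}: standard={standard:,} "
          f"transposed={transposed:,} ratio={standard // transposed}")
```

Under these assumptions the ratio between the two costs is N / C, so doubling the side length of the feature map quadruples the cost of transposed attention but multiplies the cost of standard self-attention by sixteen.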
