Adaptive Frequency Enhancement Network for Remote Sensing Image Semantic Segmentation
Feng Gao, Miao Fu, Jingchao Cao, Junyu Dong, Qian Du
TL;DR
AFENet addresses the challenge of adapting network parameters to diverse land-cover distributions in remote sensing image segmentation by introducing two modules: the Adaptive Frequency and Spatial feature Interaction Module (AFSIM) and the Selective feature Fusion Module (SFM). AFSIM adaptively separates high- and low-frequency information using FFT-based analysis and an Adaptive Window-mask Module (AWM), while SFM selectively fuses global context with local details through cross-domain attention. The model, built on a ResNet-18 backbone and reinforced by Transformer-based fusion, achieves state-of-the-art results on Vaihingen, Potsdam, and LoveDA, and its components are validated through ablation studies. The work reports improved edge precision and multi-scale segmentation, and the code is publicly released to support reproducibility and further research.
Abstract
Semantic segmentation of high-resolution remote sensing images plays a crucial role in land-use monitoring and urban planning. Recent progress in deep learning-based methods has made it possible to generate satisfactory segmentation results. However, existing methods still face challenges in adapting network parameters to various land cover distributions and in enhancing the interaction between spatial and frequency domain features. To address these challenges, we propose the Adaptive Frequency Enhancement Network (AFENet), which integrates two key components: the Adaptive Frequency and Spatial feature Interaction Module (AFSIM) and the Selective feature Fusion Module (SFM). AFSIM dynamically separates and modulates high- and low-frequency features according to the content of the input image. It adaptively generates two masks to separate high- and low-frequency components, thereby providing optimal detail and supplementary contextual information for ground object feature representation. SFM selectively fuses global context and local detailed features to enhance the network's representation capability, which further strengthens the interaction between frequency and spatial features. Extensive experiments on three publicly available datasets demonstrate that the proposed AFENet outperforms state-of-the-art methods. We also validate the effectiveness of AFSIM and SFM in handling diverse land cover types and complex scenarios. Our code is available at https://github.com/oucailab/AFENet.
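To make the mask-based separation of high- and low-frequency components more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a single 2-D feature map and a centered rectangular window in the shifted Fourier spectrum. The function name `split_frequencies` and the parameter `window_ratio` are hypothetical; in AFENet the masks are generated adaptively from the image content, whereas here the window size is simply passed in by hand.

```python
import numpy as np


def split_frequencies(feature, window_ratio):
    """Split a 2-D map into low- and high-frequency parts.

    Hypothetical helper: `window_ratio` (0..1) sets the side length of a
    centered low-pass window relative to the map size. A learned module
    would predict this from the input instead of taking it as an argument.
    """
    h, w = feature.shape
    # Move the zero-frequency (DC) component to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(feature))

    # Half-extent of the window along each axis, at least one frequency bin.
    half_h = max(1, int(round(h * window_ratio / 2)))
    half_w = max(1, int(round(w * window_ratio / 2)))
    cy, cx = h // 2, w // 2

    # Binary mask that keeps the frequencies around DC (low frequencies).
    low_mask = np.zeros((h, w), dtype=bool)
    low_mask[max(cy - half_h, 0):cy + half_h + 1,
             max(cx - half_w, 0):cx + half_w + 1] = True

    # The complementary mask keeps everything else (high frequencies).
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


# Usage: the two parts sum back to the input because the masks are complementary.
rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 8))
low, high = split_frequencies(feature, window_ratio=0.25)
assert np.allclose(low + high, feature)
```

Because the two masks partition the spectrum, the low- and high-frequency maps can be modulated separately and then recombined without losing information. A larger `window_ratio` moves more content into the low-frequency (context) branch, and a smaller one moves more into the high-frequency (detail) branch.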
