LKASeg: Remote-Sensing Image Semantic Segmentation with Large Kernel Attention and Full-Scale Skip Connections

Xuezhi Xiang, Yibo Ning, Lei Zhang, Denis Ombati, Himaloy Himu, Xiantong Zhen

TL;DR

This work addresses semantic segmentation of high-resolution remote-sensing images, targeting known limitations of CNNs and Transformers. It introduces LKASeg, which combines a Large Kernel Attention (LKA)-based decoder with Full-Scale Skip Connections (FSC) to capture global context while keeping computation efficient. The key contributions are (1) an LKA-based decoder that yields long-range, channel-adaptive features without the overhead of self-attention, (2) FSC that enable full-scale, multi-level feature learning and fusion between encoder and decoder, and (3) empirical validation on the ISPRS Vaihingen dataset, where the model reaches mF1 = 90.33% and mIoU = 82.77%. The results suggest that LKASeg handles scale variation and preserves spatial detail in remote-sensing semantic segmentation, which is relevant to geospatial analysis.
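To make the efficiency argument concrete, the following is a minimal sketch that counts parameters for a large-kernel attention branch. It assumes the decomposition used in the original LKA formulation (a depth-wise convolution, a depth-wise dilated convolution, and a 1 × 1 convolution standing in for one large K × K kernel). The kernel size 21, dilation 3, and channel width 64 are illustrative assumptions; the settings actually used in LKASeg are not stated in this summary, and the function names are hypothetical.

```python
import math


def lka_param_count(channels, kernel=21, dilation=3):
    """Parameters (weights + biases) of a decomposed large-kernel branch.

    Assumed decomposition (not confirmed for LKASeg itself):
      1. depth-wise conv with kernel (2d - 1) x (2d - 1)
      2. depth-wise dilated conv with kernel ceil(K/d) x ceil(K/d), dilation d
      3. point-wise 1 x 1 conv mixing channels
    """
    dw_kernel = 2 * dilation - 1
    dwd_kernel = math.ceil(kernel / dilation)
    depthwise = channels * dw_kernel ** 2 + channels
    depthwise_dilated = channels * dwd_kernel ** 2 + channels
    pointwise = channels * channels + channels
    return depthwise + depthwise_dilated + pointwise


def dense_conv_param_count(channels, kernel=21):
    """Parameters of one standard K x K convolution with equal in/out channels."""
    return channels * channels * kernel ** 2 + channels


if __name__ == "__main__":
    c = 64  # illustrative channel width
    print("decomposed large-kernel branch:", lka_param_count(c))
    print("dense 21 x 21 convolution:     ", dense_conv_param_count(c))
```

Under these assumptions the decomposed branch needs a few thousand parameters where a dense convolution of the same receptive field needs well over a million, which is the sense in which a large-kernel design can capture long-range context cheaply.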

Abstract

Semantic segmentation of remote sensing images is a fundamental task in geospatial research. However, widely used Convolutional Neural Networks (CNNs) and Transformers have notable drawbacks: CNNs may be limited by insufficient remote sensing modeling capability, while Transformers face challenges due to computational complexity. In this paper, we propose a remote-sensing image semantic segmentation network named LKASeg, which combines Large Kernel Attention (LKA) and Full-Scale Skip Connections (FSC). Specifically, we propose a decoder based on LKA, which extracts global features while avoiding the computational overhead of self-attention and providing channel adaptability. To achieve full-scale feature learning and fusion, we apply FSC between the encoder and decoder. We conducted experiments by combining the LKA-based decoder with FSC. On the ISPRS Vaihingen dataset, the model reaches an mF1 of 90.33% and an mIoU of 82.77%.
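For readers unfamiliar with the two reported metrics, here is a minimal sketch of how mean F1 and mean IoU are commonly derived from a per-class confusion matrix. The function names and the toy matrix are hypothetical, and which classes the paper averages over is not stated in this summary.

```python
def per_class_scores(confusion):
    """Return (f1_list, iou_list) from a square confusion matrix.

    confusion[t][p] counts pixels whose true class is t and predicted class is p.
    """
    n = len(confusion)
    f1_scores, iou_scores = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true otherwise
        iou_denominator = tp + fp + fn
        f1_denominator = 2 * tp + fp + fn
        iou_scores.append(tp / iou_denominator if iou_denominator else 0.0)
        f1_scores.append(2 * tp / f1_denominator if f1_denominator else 0.0)
    return f1_scores, iou_scores


def mean_scores(confusion):
    """Average the per-class F1 and IoU over all classes in the matrix."""
    f1_scores, iou_scores = per_class_scores(confusion)
    return sum(f1_scores) / len(f1_scores), sum(iou_scores) / len(iou_scores)


if __name__ == "__main__":
    toy = [[3, 1],
           [1, 5]]  # hypothetical two-class example, not data from the paper
    m_f1, m_iou = mean_scores(toy)
    print(f"mF1 = {m_f1:.4f}, mIoU = {m_iou:.4f}")
```

Because F1 counts true positives twice in both numerator and denominator, a class's F1 is never lower than its IoU, which is consistent with the reported mF1 exceeding the reported mIoU.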

Paper Structure

This paper contains 11 sections, 3 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Illustration of the LKASeg.
  • Figure 2: Illustration of the LKA.
  • Figure 3: Qualitative performance comparisons on the ISPRS Vaihingen dataset with an image size of 512 × 512. (a) NIRRG images, (b) Ground truth, (c) ABCNet, (d) TransUNet, (e) BANet, (f) MARes-Unet, (g) UNetformer, (h) CMTFNet and (i) LKASeg.