Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation
Xianping Ma, Ziyao Wang, Yin Hu, Xiaokang Zhang, Man-On Pun
TL;DR
The paper addresses the challenge of leveraging high-dimensional encoder features for accurate remote sensing semantic segmentation by introducing DeepKANSeg, a novel encoder-decoder network built on Kolmogorov-Arnold Networks (KANs). It replaces the traditional MLP layers of the decoder with KAN-based layers (GLKAN) and adds a stacked DeepKAN feature refinement module that decomposes complex high-dimensional representations into univariate transformations, formalized as $f(\boldsymbol{x}) = \sum_q \phi_q\left(\sum_p \psi_{pq}(x_p)\right)$, which improves feature learning and interpretability. Evaluated on ISPRS Vaihingen and Potsdam with ResNet-18 and ViT-L backbones, DeepKANSeg achieves higher mF1 and mIoU than the compared state-of-the-art methods, with notable gains on finely detailed structures and in capturing long-range context. The findings demonstrate the potential of KAN-based modules to enhance remote sensing segmentation, while acknowledging increased computational complexity and reliance on pretrained encoders as limitations, with further optimization and expanded interpretability left as future work.
Abstract
Semantic segmentation plays a crucial role in remote sensing applications, where the accurate extraction and representation of features are essential for high-quality results. Despite the widespread use of encoder-decoder architectures, existing methods often struggle to fully utilize the high-dimensional features extracted by the encoder and to efficiently recover detailed information during decoding. To address these problems, we propose a novel semantic segmentation network, namely DeepKANSeg, which incorporates two key innovations based on the emerging Kolmogorov-Arnold Network (KAN). Notably, the advantage of KAN lies in its ability to decompose high-dimensional complex functions into univariate transformations, enabling efficient and flexible representation of intricate relationships in data. First, we introduce a KAN-based deep feature refinement module, namely DeepKAN, to effectively capture complex spatial and rich semantic relationships from high-dimensional features. Second, we replace the traditional multi-layer perceptron (MLP) layers in the global-local combined decoder with KAN-based linear layers, namely GLKAN. This module enhances the decoder's ability to capture fine-grained details during decoding. To evaluate the effectiveness of the proposed method, experiments are conducted on two well-known fine-resolution remote sensing benchmark datasets, namely ISPRS Vaihingen and ISPRS Potsdam. The results demonstrate that the KAN-enhanced segmentation model achieves superior accuracy compared to state-of-the-art methods. These results highlight the potential of KANs as a powerful alternative to traditional architectures in semantic segmentation tasks. Moreover, the explicit univariate decomposition provides improved interpretability, which is particularly beneficial for applications requiring explainable learning in remote sensing.
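
To make the univariate decomposition concrete, the following is a minimal NumPy sketch of a KAN-style layer, in which every input-output pair (p, q) has its own learnable one-dimensional function and each output is the sum of those functions over the inputs. It is not the DeepKAN or GLKAN implementation from the paper. The class name ToyKANLayer, the helper rbf_basis, the Gaussian basis (used here in place of the B-splines common in KAN implementations), and all sizes are assumptions chosen for illustration only.

```python
import numpy as np


def rbf_basis(x, centers, width):
    """Evaluate Gaussian bumps at x; result has shape x.shape + (len(centers),)."""
    return np.exp(-(((x[..., None] - centers) / width) ** 2))


class ToyKANLayer:
    """Hypothetical KAN-style layer: y_q = sum_p psi_pq(x_p).

    Each univariate function psi_pq is a weighted sum of fixed Gaussian
    bumps (an assumption; real KANs typically use B-splines).
    """

    def __init__(self, in_dim, out_dim, num_basis=8, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(-1.0, 1.0, num_basis)
        self.width = 2.0 / (num_basis - 1)
        # coef[p, q, k]: weight of basis function k inside psi_pq
        self.coef = rng.normal(scale=0.1, size=(in_dim, out_dim, num_basis))

    def __call__(self, x):
        # x: (batch, in_dim) -> basis: (batch, in_dim, num_basis)
        basis = rbf_basis(x, self.centers, self.width)
        # y[b, q] = sum_p sum_k coef[p, q, k] * basis[b, p, k]
        return np.einsum("bpk,pqk->bq", basis, self.coef)


# Two stacked layers give f(x) = sum_q phi_q( sum_p psi_pq(x_p) ).
inner = ToyKANLayer(in_dim=4, out_dim=9, seed=0)  # the psi_pq functions
outer = ToyKANLayer(in_dim=9, out_dim=1, seed=1)  # the phi_q functions

x = np.random.default_rng(42).uniform(-1.0, 1.0, size=(5, 4))
y = outer(inner(x))  # shape (5, 1): one scalar f(x) per sample
```

Because each psi_pq depends on a single input coordinate, it can be plotted as an ordinary 1-D curve, which is the sense in which such a decomposition lends itself to interpretation. In a trained model the coefficients would be learned by gradient descent; here they are random and serve only to show the data flow.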
