Band Prompting Aided SAR and Multi-Spectral Data Fusion Framework for Local Climate Zone Classification

Haiyan Lan, Shujun Li, Mingjie Xie, Xuanjia Zhao, Hongning Liu, Pengming Feng, Dongli Xu, Guangjun He, Jian Guan

TL;DR

This work tackles local climate zone (LCZ) classification by fusing SAR and multi-spectral data through a band-aware, text-guided framework called BP-LCZ. It introduces band grouping to decompose the multimodal data, a band group prompting (BGP) strategy to align band-group representations with descriptive prompts, and a multivariate supervised matrix (MSM) to reduce positive/negative sample confusion in contrastive learning. On the So2Sat LCZ42 dataset, the results show substantial gains for remote sensing (RS) specific architectures: EB-CNN and ExViT both improve notably in overall accuracy (OA) and Kappa when equipped with BP-LCZ, and ablation studies confirm the complementary benefits of BGP and MSM. The approach advances multimodal fusion in remote sensing by using textual prompts to encode physical band properties and semantic categories. It offers practical gains for urban climate-related mapping, while domain shift across geographic regions remains an open challenge.

Abstract

Local climate zone (LCZ) classification is of great value for understanding the complex interactions between urban development and local climate. Recent studies have increasingly focused on fusing synthetic aperture radar (SAR) and multi-spectral data to improve LCZ classification performance. However, this fusion remains challenging because of the distinct physical properties of the two data types and the absence of effective fusion guidance. This paper proposes a novel band prompting aided data fusion framework for LCZ classification, BP-LCZ, which uses textual prompts associated with band groups to guide the model in learning the physical attributes of different bands and the semantics of the various categories inherent in SAR and multi-spectral data. This guidance augments the fused feature and thus enhances LCZ classification performance. Specifically, a band group prompting (BGP) strategy is introduced to align the visual representation with textual information at the level of band groups, which also facilitates a more adequate extraction of the semantic information of different bands. In addition, a multivariate supervised matrix (MSM) based training strategy is proposed to alleviate the problem of positive and negative sample confusion by completing the supervised information. The experimental results demonstrate the effectiveness and superiority of the proposed data fusion framework.
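
The abstract does not spell out how the MSM is built, so the following is only a minimal sketch of one plausible reading of "completing the supervised information". It assumes a CLIP-style image/text contrastive setup in which the usual identity target matrix wrongly treats two samples of the same class and band group as negatives, and it replaces that target with a matrix that marks every such pair as positive. The function names (`supervised_target_matrix`, `soft_contrastive_loss`), the `temperature` value, and the toy features are hypothetical and are not taken from the paper.

```python
import numpy as np


def supervised_target_matrix(class_ids, group_ids):
    """Row-normalized targets: pair (i, j) is positive when both samples
    share the same class AND the same band group (an assumed rule)."""
    class_ids = np.asarray(class_ids)
    group_ids = np.asarray(group_ids)
    same_class = class_ids[:, None] == class_ids[None, :]
    same_group = group_ids[:, None] == group_ids[None, :]
    positives = (same_class & same_group).astype(np.float64)
    # The diagonal is always positive, so every row sum is at least 1.
    return positives / positives.sum(axis=1, keepdims=True)


def soft_contrastive_loss(image_feats, text_feats, targets, temperature=0.07):
    """Image-to-text cross-entropy against a soft target matrix."""
    image_feats = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    text_feats = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    logits = image_feats @ text_feats.T / temperature
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=1).mean())


# Toy batch: samples 0 and 1 share a class and a band group, sample 2 does not.
targets = supervised_target_matrix(class_ids=[0, 0, 1], group_ids=[0, 0, 0])
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(3, 8))
text_feats = rng.normal(size=(3, 8))
loss = soft_contrastive_loss(image_feats, text_feats, targets)
```

With an identity target, samples 0 and 1 would be pushed apart even though they describe the same class and band group; under the assumed rule above they share the positive mass equally instead.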

Paper Structure

This paper contains 14 sections, 11 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: The overall framework of the proposed method, which consists of four steps, i.e., band grouping, band group prompting (BGP), multivariate supervised matrix (MSM) and classification. BGP is used to align image features and text features during training, so as to guide the model to learn the physical attributes of different bands and semantics of different categories contained in SAR and multi-spectral data to augment the feature fusion. MSM is used to alleviate the positive and negative sample confusion problem.
  • Figure 2: Confusion matrix of the classification results from EB-CNN with the proposed BP-LCZ method.
  • Figure 3: The t-SNE visualization of EB-CNN and EB-CNN (BP-LCZ).