Multisource Collaborative Domain Generalization for Cross-Scene Remote Sensing Image Classification

Zhu Han, Ce Zhang, Lianru Gao, Zhiqiang Zeng, Michael K. Ng, Bing Zhang, Jocelyn Chanussot

TL;DR

This work tackles cross-scene remote sensing image classification under domain shift by introducing MS-CDG, a multisource collaborative domain generalization framework. It combines data-aware adversarial augmentation, which learns semantic and domain changes across channels, with model-aware multi-level diversification, which exploits cross-domain prototypes and intra-domain kernel-based relationships to produce robust domain-invariant representations. Key contributions include a partly weight-sharing adversary network with semantic guidance, a domain encoder with spatial randomization (SpaR) and channel randomization (ChaR), cross-domain prototype clustering with multi-head attention, and a kernel mixture module for intra-class structure, all trained with a distribution-consistency constraint. Empirical results on three public multisource remote sensing datasets (Houston, Germany, LCZ) show that MS-CDG outperforms state-of-the-art domain adaptation (DA) and domain generalization (DG) methods, demonstrating improved generalization to unseen scenes and stronger class discrimination across diverse sensor modalities.

Abstract

Cross-scene image classification aims to transfer prior knowledge of ground materials to annotate regions with different distributions, reducing manual annotation cost in remote sensing. However, existing approaches focus on single-source domain generalization to unseen target domains and are easily confused by large real-world domain shifts, owing to limited training information and insufficient capacity for modeling diversity. To address this gap, we propose a novel multi-source collaborative domain generalization framework (MS-CDG) based on the homogeneity and heterogeneity characteristics of multi-source (MS) remote sensing data. The framework considers data-aware adversarial augmentation and model-aware multi-level diversification simultaneously to enhance cross-scene generalization performance. The data-aware adversarial augmentation adopts an adversary neural network with semantic guidance to generate MS samples by adaptively learning realistic channel and distribution changes across domains. From the perspectives of cross-domain and intra-domain modeling, the model-aware diversification feeds the shared spatial-channel features of MS data into a class-wise prototype module and a kernel mixture module to address domain discrepancies and cluster different classes effectively. Finally, original and augmented MS samples are classified jointly, with a distribution consistency alignment introduced to increase model diversity and ensure better domain-invariant representation learning. Extensive experiments on three public MS remote sensing datasets demonstrate the superior performance of the proposed method when benchmarked against state-of-the-art methods.
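The abstract does not say how the distribution consistency alignment between original and augmented samples is computed. As a rough illustration only, the sketch below assumes a symmetric Jensen-Shannon divergence between the two softmax outputs; the function names and the choice of divergence are assumptions for this example, not the authors' formulation.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_orig, logits_aug):
    """Jensen-Shannon divergence between predictions on an original sample
    and on its augmented counterpart (assumed form of the constraint)."""
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


if __name__ == "__main__":
    # Hypothetical logits for one pixel over three classes.
    print(consistency_loss([2.0, 0.5, -1.0], [1.5, 0.8, -0.5]))
```

A loss of this kind is zero when the classifier predicts the same distribution for both views and grows as the predictions diverge, which is the behavior a consistency constraint needs regardless of the exact divergence used.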

Paper Structure

This paper contains 21 sections, 17 equations, 11 figures, 9 tables, 1 algorithm.

Figures (11)

  • Figure 1: Illustrative comparison for MS data augmentation and diversity modeling on different cross-scene methods. (a) Existing cross-scene framework. (b) Our proposed MS-CDG framework.
  • Figure 2: The framework of the proposed MS-CDG, including data-aware adversarial augmentation and model-aware multi-level diversification. The multi-domain data augmentation uses a partly weight-sharing adversary neural network with semantic guidance to generate mixed MS images, which are then fed into the trained backbone for joint classification. The embedding features at the cross-domain and intra-domain levels are optimized simultaneously by modeling pixel-to-prototype and high-order intra-class compactness relationships for different domains, enhancing domain-invariant representation capability.
  • Figure 3: Architecture of the domain encoder consisting of spatial randomization (SpaR) and channel randomization (ChaR) for different domains. The generated multi-dimensional domain information is further sent to achieve intra-domain (ID) and cross-domain (CD) feature fusion.
  • Figure 4: The conceptual illustration of different feature fusion strategies. (a) Cross-domain feature fusion. (b) Intra-domain feature fusion.
  • Figure 5: MS remote sensing datasets, including two remote sensing data sources and the corresponding ground-truth (GT) map. (a) Houston 2013 dataset. (b) Houston 2018 dataset. (c) Augsburg dataset. (d) Berlin dataset. (e) LCZ Berlin dataset. (f) LCZ Hong Kong dataset.
  • ...and 6 more figures
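
The Figure 2 caption mentions pixel-to-prototype relationships at the cross-domain level. A minimal sketch of that general idea follows, assuming each class prototype is the mean embedding pooled over all source domains and that pixels are matched by cosine similarity. The paper's multi-head attention clustering is not reproduced here, and all names and values are hypothetical.

```python
import math


def class_prototypes(features, labels):
    """Average the embeddings of each class into one prototype vector."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def nearest_prototype(vec, prototypes):
    """Return the class label whose prototype is most similar to vec."""
    return max(prototypes, key=lambda lab: cosine(vec, prototypes[lab]))


if __name__ == "__main__":
    # Hypothetical 2-D pixel embeddings pooled from two source domains.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labs = [0, 0, 1, 1]
    protos = class_prototypes(feats, labs)
    print(nearest_prototype([0.8, 0.2], protos))
```

Pooling embeddings from several domains before averaging is what makes a prototype cross-domain in this sketch: a pixel from an unseen scene is compared against class centers that already blend the source domains.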