Multisource Collaborative Domain Generalization for Cross-Scene Remote Sensing Image Classification
Zhu Han, Ce Zhang, Lianru Gao, Zhiqiang Zeng, Michael K. Ng, Bing Zhang, Jocelyn Chanussot
TL;DR
This work tackles cross-scene remote sensing (RS) image classification under domain shift by introducing MS-CDG, a multisource (MS) collaborative domain generalization framework. It combines data-aware adversarial augmentation, which learns realistic channel and distribution changes across domains, with model-aware multi-level diversification, which exploits cross-domain prototypes and intra-domain kernel-based relationships to produce robust domain-invariant representations. Key contributions include a partially weight-sharing adversary with semantic guidance, a domain encoder with SpaR and ChaR, cross-domain prototype clustering with multi-head attention, and a kernel mixture module for intra-class structure, all trained with a distribution-consistency constraint. Experiments on three public multisource RS datasets (Houston, Germany, LCZ) show that MS-CDG outperforms state-of-the-art domain adaptation (DA) and domain generalization (DG) methods, with improved generalization to unseen scenes and stronger class discrimination across diverse sensor modalities.
Abstract
Cross-scene image classification aims to transfer prior knowledge of ground materials to annotate regions with different distributions, reducing the manual annotation cost in remote sensing. However, existing approaches focus on single-source domain generalization to unseen target domains and are easily confused by large real-world domain shifts, owing to limited training information and insufficient capacity for modeling diversity. To address this gap, we propose a novel multisource collaborative domain generalization framework (MS-CDG) based on the homogeneity and heterogeneity characteristics of multisource (MS) remote sensing data. The framework considers data-aware adversarial augmentation and model-aware multi-level diversification simultaneously to enhance cross-scene generalization performance. The data-aware adversarial augmentation adopts an adversarial neural network with semantic guidance to generate MS samples by adaptively learning realistic channel and distribution changes across domains. From the perspectives of cross-domain and intra-domain modeling, the model-aware diversification feeds the shared spatial-channel features of MS data into class-wise prototype and kernel mixture modules to address domain discrepancies and cluster different classes effectively. Finally, the original and augmented MS samples are classified jointly under a distribution consistency alignment, which increases model diversity and ensures better domain-invariant representation learning. Extensive experiments on three public MS remote sensing datasets demonstrate the superior performance of the proposed method when benchmarked against state-of-the-art methods.
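The final step, joint classification of original and augmented samples under a distribution consistency alignment, can be illustrated with a minimal sketch. The abstract does not specify the form of the consistency term, so the symmetric KL divergence used here, the `weight` parameter, and all function names are assumptions chosen for illustration rather than the formulation used in MS-CDG.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Assumed consistency measure: average of KL(p||q) and KL(q||p)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def joint_loss(logits_orig, logits_aug, label, weight=1.0):
    """Classify both views and penalize disagreement between them.

    logits_orig: class scores for the original sample
    logits_aug:  class scores for its augmented counterpart
    weight:      hypothetical trade-off coefficient (not from the paper)
    """
    p_orig = softmax(logits_orig)
    p_aug = softmax(logits_aug)
    classification = cross_entropy(p_orig, label) + cross_entropy(p_aug, label)
    consistency = symmetric_kl(p_orig, p_aug)
    return classification + weight * consistency


if __name__ == "__main__":
    # Predictions that agree incur no consistency penalty;
    # predictions that disagree are penalized in addition to the classification loss.
    print(joint_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1], label=0))
    print(joint_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0], label=0))
```

In this sketch the consistency term vanishes when the two predicted distributions coincide, so it only pushes the classifier toward agreement between an original sample and its augmented version, which is one plausible reading of how such an alignment could encourage domain-invariant representations.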
