CrossEarth-Gate: Fisher-Guided Adaptive Tuning Engine for Efficient Adaptation of Cross-Domain Remote Sensing Semantic Segmentation

Shilei Cao, Ziyang Gong, Hehai Lin, Yang Liu, Jiashun Cheng, Xiaoxing Hu, Haoyuan Liang, Guowen Li, Chengwei Qin, Hong Cheng, Xue Yang, Juepeng Zheng, Haohuan Fu

TL;DR

CrossEarth-Gate tackles the challenge of activating large Geospatial Foundation Models for cross-domain RS semantic segmentation by introducing a structured Remote Sensing Module Toolbox and a Fisher Information–guided adaptive selection mechanism. The toolbox integrates spatial (LoRA), semantic (Adapter), and frequency (Earth-Adapter) modules inserted throughout the Transformer backbone, while the selection mechanism dynamically gates the most impactful modules to maximize task-specific gradient flow. Empirical results across 16 RS domain generalization (DG) and domain adaptation (DA) benchmarks show state-of-the-art performance, with mIoU gains of up to 3.2 percentage points and strong generalization across climate zones, disaster scenarios, and unlabeled target domains, all with minimal parameter updates. The work also provides ablations and qualitative analyses indicating that both the diversified toolbox and the Fisher-guided gating are needed for robust RS domain adaptation, and the authors state that the code will be released.

Abstract

In Remote Sensing (RS), Parameter-Efficient Fine-Tuning (PEFT) has emerged as a key approach to activate the generalizable representation ability of foundation models for downstream tasks. However, existing specialized PEFT methods often fail when applied to large-scale Earth observation tasks, as they are unable to fully handle the multifaceted and unpredictable domain gaps (e.g., spatial, semantic, and frequency shifts) inherent in RS data. To overcome this, we propose CrossEarth-Gate, which introduces two primary contributions. First, we establish a comprehensive RS module toolbox to address multifaceted domain gaps, comprising spatial, semantic, and frequency modules. Second, we develop a Fisher-guided adaptive selection mechanism that operates on this toolbox. The mechanism uses Fisher Information to quantify each module's importance by measuring its contribution to the task-specific gradient flow. It dynamically activates only the most critical modules at the appropriate layers, guiding the gradient flow to maximize adaptation effectiveness and efficiency. Comprehensive experiments validate the efficacy and generalizability of our method: CrossEarth-Gate achieves state-of-the-art performance across 16 cross-domain benchmarks for RS semantic segmentation. The code will be released.
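
The selection idea described in the abstract can be illustrated with a minimal sketch. It assumes a common empirical Fisher approximation (the squared gradient norm of a module's parameters, averaged over mini-batches) and a simple top-k gate; the abstract does not specify the paper's exact scoring or scheduling, so this should not be read as the authors' implementation. The module names, gradient values, and the functions `fisher_scores` and `select_top_k` are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Sequence, Set


def fisher_scores(grad_batches: Sequence[Dict[str, List[float]]]) -> Dict[str, float]:
    """Empirical Fisher importance per module (an assumed approximation):
    squared gradient norm of the module's parameters, averaged over batches."""
    if not grad_batches:
        raise ValueError("need at least one batch of gradients")
    totals: Dict[str, float] = {}
    for grads in grad_batches:
        for name, g in grads.items():
            # Accumulate the sum of squared gradient entries for this module.
            totals[name] = totals.get(name, 0.0) + sum(x * x for x in g)
    n = len(grad_batches)
    return {name: total / n for name, total in totals.items()}


def select_top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Keep the k highest-scoring modules active; the rest would stay frozen."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


# Hypothetical per-module gradients from two mini-batches (illustrative values).
grad_batches = [
    {"block0.lora": [0.1, -0.2], "block0.adapter": [0.5, 0.5], "block1.freq": [0.0, 0.1]},
    {"block0.lora": [0.3, 0.0], "block0.adapter": [0.4, -0.6], "block1.freq": [0.1, 0.0]},
]

scores = fisher_scores(grad_batches)
active = select_top_k(scores, k=2)
print(sorted(active))
```

In this toy setting the two modules with the largest averaged squared gradients are kept trainable. In a real training loop, the scores would presumably be recomputed periodically so that the active set can change as adaptation proceeds.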

Paper Structure

This paper contains 56 sections, 8 equations, 14 figures, and 8 tables.

Figures (14)

  • Figure 1: Overview of CrossEarth-Gate and its comparative advantages. (a) Existing PEFT methods typically focus on one specific functional pathway (e.g., LoRA for spatial, AdaptFormer for semantic, Earth-Adapter for frequency). The qualitative example of generalization across different climate zones shows each baseline failing on challenges outside its specialty, while our method succeeds. (b) Our proposed CrossEarth-Gate establishes a toolbox combining all three module types. We then use Fisher Information to guide the gradient flow, activating only the most critical modules at the most relevant blocks for a specific domain. (c) CrossEarth-Gate achieves a superior performance-parameter tradeoff. (d) CrossEarth-Gate achieves state-of-the-art results across 16 challenging DG and DA benchmarks.
  • Figure 2: Visualizations of predicted segmentation maps of PEFT methods. In the CASID [liu2023large] dataset, red is the road class, yellow is the building class, blue is the water class, green is the forest class, and black is the background class. In the RescueNet [rahnemoonfar2023rescuenet] dataset, white is the impervious surface class, red is the clutter class, blue is the building class, green is the vegetation class, and yellow is the car class.
  • Figure 3: Ablation study of model components on the CASID benchmarks. We compare the performance and trainable parameters of CrossEarth-Gate against versions with key components removed.
  • Figure 4: Dynamic network analysis on the TemMs. (a) The vertical axis represents training steps (flowing downwards), and the bubble size corresponds to importance score. (b) Aggregated importance intensity of different module types across layers in the entire process.
  • Figure 5: Complete visualizations of predicted segmentation maps of PEFT methods. These samples are collected from the Sub2Tms domain generalization benchmark on the CASID [liu2023large] dataset, where red is the road class, yellow is the building class, blue is the water class, green is the forest class, and black is the background class.
  • ...and 9 more figures