Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation

Xiaoxing Hu; Ziyang Gong; Yupei Wang; Yuru Jia; Fei Lin; Dexiang Gao; Ke An; Jianhong Han; Zhuoran Sun; Gen Luo; Xue Yang

TL;DR

Earth-Adapter addresses the degradation that artifacts cause in Remote Sensing (RS) feature representations when Parameter-Efficient Fine-Tuning (PEFT) is applied to RS semantic segmentation. It introduces a frequency-guided Mixture of Adapters (MoA): features are split into low- and high-frequency components via the Discrete Fourier Transform (DFT), and a dynamic router adaptively fuses the adapter outputs, all while the backbone Vision Foundation Models (VFMs) stay frozen. Across semantic segmentation (SS), Domain Adaptation (DA), and Domain Generalization (DG) tasks, it achieves state-of-the-art results on 12 RS benchmarks, with notable DA gains and more robust generalization, demonstrating effective artifact suppression and feature denoising in RS imagery. The work offers practical benefits for deploying large VFMs in RS applications with limited fine-tuning, and provides extensive analyses of adapter configurations, frequency cutoffs, and layer choices to guide future RS PEFT design.

Abstract

Parameter-Efficient Fine-Tuning (PEFT) is a technique that allows us to adapt powerful Foundation Models (FMs) to diverse downstream tasks while preserving and unleashing their inherent capabilities. However, we have observed that existing PEFT methods, which are often designed with natural imagery in mind, struggle when applied to Remote Sensing (RS) scenarios. This is primarily due to their inability to handle the influence of artifacts, a problem that is particularly severe in RS image features. To tackle this challenge, we introduce Earth-Adapter, the first PEFT method specifically designed to overcome artifacts in RS. Earth-Adapter introduces a novel Mixture of Frequency Adaptation process that combines a Mixture of Adapter (MoA) with the Discrete Fourier Transform (DFT). By utilizing the DFT, Earth-Adapter can decompose features into different frequency components, precisely separating artifacts from the original features. The MoA then dynamically assigns weights to each adapter expert, allowing features from different frequency domains to be combined. These simple yet effective approaches enable Earth-Adapter to overcome the disturbances caused by artifacts more efficiently than previous PEFT methods, significantly enhancing the FMs' performance in RS scenarios. Experiments on Domain Adaptation (DA) and Domain Generalization (DG) semantic segmentation benchmarks showcase Earth-Adapter's effectiveness. Compared with the baseline Rein, Earth-Adapter improves by 9.0% mIoU on DA and 3.1% mIoU on DG benchmarks. Our code will be released at https://github.com/VisionXLab/Earth-Adapter.
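To make the idea in the abstract more concrete, the following is a minimal NumPy sketch of a frequency split followed by a routed mixture of adapters. It is an illustration under stated assumptions, not the released Earth-Adapter code: the function names (`frequency_split`, `low_rank_adapter`, `mixture_of_frequency_adapters`), the circular low-pass mask, the `cutoff` parameter, the low-rank adapter with a ReLU, the pooled-feature softmax router, and the residual addition onto the frozen feature are all hypothetical choices made for this example.

```python
import numpy as np

def _to_spatial(spec):
    # Undo the centering shift and return the real part of the inverse 2D DFT.
    return np.fft.ifft2(np.fft.ifftshift(spec, axes=(-2, -1)), axes=(-2, -1)).real

def frequency_split(feat, cutoff=0.3):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    `cutoff` (hypothetical) is the fraction of the Nyquist radius kept as "low".
    """
    _, H, W = feat.shape
    spec = np.fft.fftshift(np.fft.fft2(feat, axes=(-2, -1)), axes=(-2, -1))
    fy = np.fft.fftshift(np.fft.fftfreq(H))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(W))[None, :]
    low_mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff * 0.5
    # Complementary masks, so low + high reconstructs the input.
    return _to_spatial(spec * low_mask), _to_spatial(spec * ~low_mask)

def low_rank_adapter(x, w_down, w_up):
    """Assumed adapter: channel-wise down-projection, ReLU, up-projection."""
    h = np.maximum(np.einsum("rc,chw->rhw", w_down, x), 0.0)
    return np.einsum("cr,rhw->chw", w_up, h)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mixture_of_frequency_adapters(feat, adapters, w_router, cutoff=0.3):
    """Route between one adapter per frequency branch (low, high).

    adapters: list of (w_down, w_up) pairs, one per branch.
    w_router: (num_branches, C) matrix applied to the pooled feature (assumed).
    """
    branches = frequency_split(feat, cutoff)
    gates = softmax(w_router @ feat.mean(axis=(1, 2)))
    delta = sum(g * low_rank_adapter(b, *p)
                for g, b, p in zip(gates, branches, adapters))
    # The backbone feature is left untouched; only the adapter delta is added.
    return feat + delta, gates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))          # stand-in for a frozen backbone feature
    adapters = [(rng.standard_normal((2, 8)) * 0.1,  # rank-2 adapters, one per branch
                 rng.standard_normal((8, 2)) * 0.1) for _ in range(2)]
    w_router = rng.standard_normal((2, 8))
    out, gates = mixture_of_frequency_adapters(feat, adapters, w_router)
    print(out.shape, gates)
```

In this sketch only the adapter and router weights would be trainable, which mirrors the PEFT setting described above; how the actual method places adapters across layers and chooses the frequency cutoff is not specified here.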
