Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation

Jiapeng Su, Qi Fan, Guangming Lu, Fanglin Chen, Wenjie Pei

TL;DR

The paper tackles cross-domain few-shot segmentation by decoupling domain adaptation from source-domain model training: it introduces a compact domain-rectifying adapter that aligns diverse target-domain feature styles with the source domain. To train this adapter, it synthesizes diverse target-domain styles through local and global perturbations of feature channel statistics and enforces robust rectification via a cyclic domain alignment loss. Empirical results on CD-FSS benchmarks show substantial gains over traditional few-shot methods and domain-transfer baselines, including notable improvements on Chest X-ray and DeepGlobe datasets, and further gains when extending to transformers. This approach reduces overfitting risk in data-scarce few-shot settings and provides a practical mechanism to leverage strong source-domain models for cross-domain segmentation tasks.

Abstract

Few-shot semantic segmentation (FSS) has achieved great success in segmenting objects of novel classes, supported by only a few annotated samples. However, existing FSS methods often underperform in the presence of domain shifts, especially when encountering new domain styles that are unseen during training. It is suboptimal to directly adapt or generalize the entire model to new domains in the few-shot scenario. Instead, our key idea is to adapt a small adapter for rectifying diverse target domain styles to the source domain. Consequently, the rectified target domain features can benefit from the well-optimized source domain segmentation model, which is trained on sufficient source domain data. Training the domain-rectifying adapter requires sufficiently diverse target domains. We thus propose a novel local-global style perturbation method to simulate diverse potential target domains by perturbing the feature channel statistics of the individual images and the collective statistics of the entire source domain, respectively. Additionally, we propose a cyclic domain alignment module that helps the adapter rectify domains effectively using reverse domain rectification supervision. The adapter is trained to rectify the image features from diverse synthesized target domains to align with the source domain. During testing on target domains, we start by rectifying the image features and then conduct few-shot segmentation on the domain-rectified features. Extensive experiments demonstrate the effectiveness of our method, achieving promising results on cross-domain few-shot semantic segmentation tasks. Our code is available at https://github.com/Matt-Su/DR-Adapter.
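
The local-global style perturbation described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names, the multiplicative Gaussian noise model, and the noise scales (0.1 and 0.3) are assumptions chosen for illustration. A feature map is represented as a list of channels, each a flat list of activations. The local branch jitters the image's own channel statistics, while the global branch jitters dataset-level statistics.

```python
import random
import statistics

def channel_stats(feature):
    """Per-channel mean and standard deviation of a feature map."""
    means = [statistics.fmean(ch) for ch in feature]
    stds = [statistics.pstdev(ch) + 1e-6 for ch in feature]  # epsilon avoids division by zero
    return means, stds

def restyle(feature, new_means, new_stds):
    """Normalize each channel, then re-apply the given statistics."""
    means, stds = channel_stats(feature)
    return [[(x - m) / s * ns + nm for x in ch]
            for ch, m, s, nm, ns in zip(feature, means, stds, new_means, new_stds)]

def perturb(feature, base_means, base_stds, noise_scale, rng):
    """Jitter the base statistics with Gaussian noise (assumed noise model) and restyle."""
    new_means = [m * (1.0 + rng.gauss(0.0, noise_scale)) for m in base_means]
    new_stds = [abs(s * (1.0 + rng.gauss(0.0, noise_scale))) for s in base_stds]
    return restyle(feature, new_means, new_stds)

def local_global_perturb(feature, dataset_means, dataset_stds, rng,
                         local_noise=0.1, global_noise=0.3):
    """Return a locally and a globally perturbed copy of `feature`.

    local_noise and global_noise are hypothetical values, not the paper's settings.
    """
    means, stds = channel_stats(feature)
    local = perturb(feature, means, stds, local_noise, rng)              # per-image statistics
    global_ = perturb(feature, dataset_means, dataset_stds, global_noise, rng)  # dataset statistics
    return local, global_
```

For example, `local_global_perturb(feature, dataset_means, dataset_stds, random.Random(0))` returns two restyled copies with the same shape as `feature`. The content (normalized activations) is preserved and only the channel statistics change, which is the property the adapter is trained to undo.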

Paper Structure

This paper contains 20 sections, 11 equations, 7 figures, and 10 tables.

Figures (7)

  • Figure 1: Comparison of our method with other approaches. (a) Traditional few-shot segmentation (FSS) methods train and test the model on the same domain. (b) Most domain generalization (DG) methods leverage multiple source domains to train and adapt the large-parameter model simultaneously. (c) In contrast to conventional DG methods, we propose using a lightweight adapter as a substitute. This adapter is designed to adapt to various domain data, thereby decoupling domain adaptation from the source domain training process.
  • Figure 2: Feature channel statistics of an individual sample and the average statistics across the dataset, computed on the pretrained backbone at stage 1. The average statistics exhibit a smoother profile than those of an individual sample, allowing more substantial noise to be applied to features with the smoother statistics.
  • Figure 3: Overview of our cross-domain few-shot segmentation approach. Our method consists of two modules: a feature perturbation module and a feature rectification module. The former is used to generate simulated domain features, while the latter trains the adapter by restoring the features to their original states. During the perturbation process, we employ both local and global perturbations, each controlled by its own probability $P$ that decides whether a feature is perturbed. Note that when both probabilities exceed 0.5, the entire backbone undergoes standard training. During testing, we treat target domain features as perturbed features and directly rectify them using the adapter.
  • Figure 4: The process of cycle alignment, where 'P' denotes perturbation and 'R' stands for rectification.
  • Figure 5: Qualitative results of our model and the baseline in the 1-way 1-shot setting on challenging scenarios with a large domain gap.
  • ...and 2 more figures
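
The perturb-then-rectify cycle named in the Figure 4 caption ('P' for perturbation, 'R' for rectification) can be sketched as a toy loss. This is one plausible reading of the caption, not the authors' exact formulation: the L1 distance, the multiplicative noise, and the per-channel scaling adapter are all assumptions, and the sketch operates on channel statistics rather than full feature maps.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def perturb(stats, noise):
    """'P' step: scale each channel statistic by (1 + noise_c)."""
    return [s * (1.0 + n) for s, n in zip(stats, noise)]

def rectify(stats, adapter_scale):
    """'R' step: a toy adapter that rescales each channel statistic."""
    return [s * a for s, a in zip(stats, adapter_scale)]

def cyclic_alignment_loss(source_stats, noise, adapter_scale):
    """Forward term: R(P(source)) should match source.
    Reverse term: perturbing the rectified result again should reproduce P(source).
    """
    perturbed = perturb(source_stats, noise)        # source -> synthetic target
    rectified = rectify(perturbed, adapter_scale)   # synthetic target -> source
    forward = l1(rectified, source_stats)
    re_perturbed = perturb(rectified, noise)        # back to the synthetic target
    reverse = l1(re_perturbed, perturbed)
    return forward + reverse
```

Under these assumptions, an adapter that exactly inverts the perturbation (scale `1 / (1 + noise_c)` per channel) drives both terms to zero, while an identity adapter leaves a positive loss. At test time only the 'R' step would be applied, treating target-domain features as if they were perturbed source features.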