Cross-Domain Few-Shot Segmentation via Multi-view Progressive Adaptation

Jiahao Nie, Guanqiao Fu, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

TL;DR

The paper tackles Cross-Domain Few-Shot Segmentation (CD-FSS) under severe data scarcity and large domain gaps. It presents Multi-view Progressive Adaptation (MPA), which combines Hybrid Progressive Augmentation to create increasingly challenging augmented views and Dual-chain Multi-view Prediction to exploit them through sequential and parallel learning paths. The approach yields state-of-the-art performance across five data-scarce domains and remains effective in a source-free setting, with substantial efficiency gains. These results underscore the value of progressively adapting few-shot capability via both data complexity and predictive strategy in CD-FSS, offering a pathway to broader cross-domain segmentation tasks.

Abstract

Cross-Domain Few-Shot Segmentation aims to segment categories in data-scarce domains conditioned on a few exemplars. Typical methods first establish few-shot capability in a large-scale source domain and then adapt it to target domains. However, due to the limited quantity and diversity of target samples, existing methods still exhibit constrained performance. Moreover, the source-trained model's initially weak few-shot capability in target domains, coupled with substantial domain gaps, severely hinders the effective utilization of target samples and further impedes adaptation. To this end, we propose Multi-view Progressive Adaptation, which progressively adapts few-shot capability to target domains from both data and strategy perspectives. (i) From the data perspective, we introduce Hybrid Progressive Augmentation, which progressively generates more diverse and complex views through cumulative strong augmentations, thereby creating increasingly challenging learning scenarios. (ii) From the strategy perspective, we design Dual-chain Multi-view Prediction, which fully leverages these progressively complex views through sequential and parallel learning paths under extensive supervision. By jointly enforcing prediction consistency across diverse and complex views, MPA achieves both robust and accurate adaptation to target domains. Extensive experiments demonstrate that MPA effectively adapts few-shot capability to target domains, outperforming state-of-the-art methods by a large margin (+7.0%).
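
The "cumulative strong augmentations" idea in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual augmentation operations or their schedule, so everything below is an assumption made for illustration: the toy nested-list image format, the three photometric operations (`brighten`, `add_contrast`, `invert`), and the helper name `progressive_views`. The only point it demonstrates is that view $i$ is produced by stacking one more augmentation on top of view $i-1$, so later views are progressively harder.

```python
from typing import Callable, List

# Toy grayscale image: nested lists of floats in [0, 1] (illustrative format only).
Image = List[List[float]]
Augment = Callable[[Image], Image]


def _clip(p: float) -> float:
    """Keep a pixel value inside [0, 1]."""
    return max(0.0, min(1.0, p))


# Hypothetical photometric augmentations. They leave geometry untouched, so the
# segmentation mask of the original image stays valid for every view.
def brighten(img: Image) -> Image:
    return [[_clip(p + 0.2) for p in row] for row in img]


def add_contrast(img: Image) -> Image:
    return [[_clip(0.5 + 1.5 * (p - 0.5)) for p in row] for row in img]


def invert(img: Image) -> Image:
    return [[1.0 - p for p in row] for row in img]


def progressive_views(img: Image, augments: List[Augment]) -> List[Image]:
    """Return [view_0, ..., view_n]: view_0 is the original image and
    view_i applies the first i augmentations cumulatively."""
    views = [img]
    current = img
    for aug in augments:
        current = aug(current)  # stack one more augmentation on the previous view
        views.append(current)
    return views


if __name__ == "__main__":
    image = [[0.0, 0.5], [1.0, 0.25]]
    for level, view in enumerate(progressive_views(image, [brighten, add_contrast, invert])):
        print(level, view)
```

In a real pipeline the geometric augmentations would also have to be applied to the query mask; this sketch sidesteps that by using photometric operations only.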

Paper Structure (17 sections, 9 equations, 4 figures, 9 tables)

This paper contains 17 sections, 9 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: Top: Simply incorporating multiple augmented views of the accessible target samples increases the number of samples available for establishing few-shot capability but yields only marginal gains, as the large domain gap limits effective utilization of heavily perturbed views. In contrast, our proposed Multi-view Progressive Adaptation (MPA) significantly improves performance. Bottom: MPA adopts a progressive strategy to address the challenges posed by large domain gaps. Specifically, it starts with an easy task and progressively increases task complexity as the model becomes more capable during adaptation. This design enables smooth adaptation of the source-trained model and effectively establishes few-shot capability in the target domains.
  • Figure 2: The framework of the proposed Multi-view Progressive Adaptation (MPA). MPA starts with Hybrid Progressive Augmentation (HPA), highlighted in the yellow box, which introduces two strategies to progressively increase the complexity and diversity of augmented query images. Leveraging the augmented views, MPA establishes few-shot capability in target domains via Dual-chain Multi-view Prediction (DMP), which incorporates sequential and parallel chains, highlighted in the red box. Note that $s$ and $q_i$ denote the support image and the $i$-th query image, $F_s$ and $F_{q_i}$ denote the support features and the $i$-th query features extracted by the encoder, and $P_s$ denotes the support prototype.
  • Figure 3: Qualitative illustrations over five data-scarce domains: DeepGlobe, ISIC, Chest X-Ray, FSS-1000, and SUIM, from top to bottom. Segmentation comparisons with state-of-the-art methods are shown in (a). For each pair of support image (ground truth highlighted in green) and query image (ground truth highlighted in blue), Columns 3-5 show the corresponding segmentation heatmaps produced by PATNet (Lei et al., 2022), SSP (Fan et al., 2022), and our MPA. Segmentation heatmaps of MPA over samples from the five domains are shown in (b). Best viewed in color.
  • Figure 4: Ablation study on different adaptation strategies.
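
The Figure 2 caption refers to a support prototype $P_s$ computed from the support features $F_s$, but does not say how it is computed or how it is matched against the query features $F_{q_i}$. The sketch below assumes two choices that are common in few-shot segmentation, masked average pooling for the prototype and thresholded cosine similarity for matching; neither is confirmed by the text above as MPA's actual design. The function names, the flattened per-pixel feature format, and the `threshold` value are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def masked_average_prototype(features: List[Vector], mask: List[int]) -> Vector:
    """Average the support feature vectors of foreground pixels.

    features: one C-dimensional vector per pixel (image flattened to H*W).
    mask:     1 for foreground, 0 for background, same length as features.
    """
    foreground = [f for f, m in zip(features, mask) if m]
    if not foreground:
        raise ValueError("support mask has no foreground pixels")
    dim = len(foreground[0])
    return [sum(f[c] for f in foreground) / len(foreground) for c in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def predict_mask(query_features: List[Vector], prototype: Vector,
                 threshold: float = 0.5) -> List[int]:
    """Label a query pixel as foreground when it is similar enough to the prototype."""
    return [1 if cosine(f, prototype) > threshold else 0 for f in query_features]


if __name__ == "__main__":
    support_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    support_mask = [1, 0, 1]
    p_s = masked_average_prototype(support_features, support_mask)
    print(p_s, predict_mask([[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]], p_s))
```

Under these assumptions, each augmented query view would be matched against the same prototype, which is what makes it possible to compare predictions across views.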