Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation

Jintao Tong, Yixiong Zou, Yuhua Li, Ruixuan Li

TL;DR

This work reports an intriguing phenomenon: simply filtering different frequency components for target domains can significantly improve performance, sometimes by as much as 14% mIoU. Building on this, it proposes a lightweight frequency masker that further reduces inter-channel correlation through an Amplitude-Phase Masker (APM) module and an Adaptive Channel Phase Attention (ACPA) module.

Abstract

Cross-domain few-shot segmentation (CD-FSS) first pre-trains a model on a large-scale source-domain dataset and then transfers it to data-scarce target-domain datasets for pixel-level segmentation. The significant domain gap between the source and target datasets causes a sharp decline in the performance of existing few-shot segmentation (FSS) methods in cross-domain scenarios. In this work, we discover an intriguing phenomenon: simply filtering different frequency components for target domains can lead to a significant performance improvement, sometimes as high as 14% mIoU. We then investigate this phenomenon and find that the improvements stem from reduced inter-channel correlation in feature maps, which benefits CD-FSS through enhanced robustness against domain gaps and larger activated regions for segmentation. Based on this, we propose a lightweight frequency masker, which further reduces channel correlations through an Amplitude-Phase Masker (APM) module and an Adaptive Channel Phase Attention (ACPA) module. Notably, APM introduces only 0.01% additional parameters but improves the average performance by over 10%, and ACPA introduces only 2.5% additional parameters but further improves the performance by over 1.5%, significantly surpassing state-of-the-art CD-FSS methods.
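
The kind of frequency filtering described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `filter_amplitude`, the circular cutoff `radius`, and the choice to work on a single 2-D array are all assumptions made for illustration. The sketch keeps the phase spectrum intact and retains only the low- or high-frequency part of the amplitude spectrum.

```python
import numpy as np

def filter_amplitude(image, radius, keep="low"):
    """Keep only the low- or high-frequency amplitude of a 2-D array.

    Hypothetical helper: `radius` is a cutoff in frequency bins measured
    from the centre of the shifted spectrum. The phase is left unchanged.
    """
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)

    # Distance of every frequency bin from the DC bin at (h // 2, w // 2).
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)

    low_mask = dist <= radius
    mask = low_mask if keep == "low" else ~low_mask

    # Recombine the masked amplitude with the original phase and invert.
    filtered = (amplitude * mask) * np.exp(1j * phase)
    return np.fft.ifft2(np.fft.ifftshift(filtered)).real
```

Calling `filter_amplitude(img, radius=8, keep="low")` would correspond loosely to a low-frequency amplitude with full phase setting; the cutoff value here is arbitrary and not taken from the paper.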

Paper Structure

This paper contains 36 sections, 16 equations, 10 figures, 9 tables.

Figures (10)

  • Figure 1: For a model already trained on the source domain, we simply filter out different frequency components of the images and plot mIoU against the components that are retained. $P$ denotes Phase, $A$ denotes Amplitude, $H$ denotes High Frequency, and $L$ denotes Low Frequency. The performance is significantly improved in most cases compared with the baseline ($A_x$, $P_x$), by as much as 14% on the Chest X-ray dataset ($A^L_x$, $P_x$). In this paper, we delve into this phenomenon for an interpretation, and propose a lightweight frequency masker for efficient cross-domain few-shot segmentation.
  • Figure 2: Mean Magnitude of Channels (MMC) for the best case in Fig. 1 on four target datasets. For domains with improved performance, their curves are lower than the baseline after masking.
  • Figure 3: (a) After masking certain frequency components, the model's attention regions are enlarged with more patterns encompassed. (b) A higher concentration of phase differences at 0 and $\pi$ indicates a higher correlation, so that on FSS-1000 the performance drops but on Chest X-ray it increases.
  • Figure 4: Overview of our method in a 1-shot example. After obtaining the feature map, APM is introduced to adaptively filter certain frequency components based on different domains, facilitating feature disentanglement to achieve more generalizable representations. Additionally, we propose ACPA to encourage the model to focus on more effective features while aligning the feature space of the support and query images. The internal structure of APM and ACPA is highlighted in green.
  • Figure 5: Qualitative results of our model.
  • ...and 5 more figures
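
The abstract attributes the gains to reduced inter-channel correlation in feature maps. As a rough illustration of what such a quantity can look like, the sketch below computes the mean absolute Pearson correlation between distinct channels of a feature map. This is an assumed, generic measure, not the paper's MMC metric or its phase-difference analysis, neither of which is defined in this excerpt; the function name and the `(C, H, W)` layout are hypothetical.

```python
import numpy as np

def mean_abs_channel_correlation(features):
    """Mean absolute Pearson correlation between distinct channels.

    `features` is assumed to have shape (C, H, W) with C >= 2.
    Constant channels have zero variance and would yield NaN.
    """
    c = features.shape[0]
    flat = features.reshape(c, -1)           # one row per channel
    corr = np.corrcoef(flat)                 # (C, C) correlation matrix
    off_diag = corr[~np.eye(c, dtype=bool)]  # drop self-correlations
    return float(np.mean(np.abs(off_diag)))
```

Under this measure, a lower value after masking would indicate less redundant channels; comparing the value before and after a frequency filter is one simple way to probe the effect described above.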