AFANet: Adaptive Frequency-Aware Network for Weakly-Supervised Few-Shot Semantic Segmentation

Jiaqi Ma; Guo-Sen Xie; Fang Zhao; Zechao Li

TL;DR

AFANet addresses weakly-supervised few-shot semantic segmentation (WFSS) by combining frequency-domain information with online cross-modal guidance. The Cross-Granularity Frequency-Aware Module (CFM) decouples RGB features into high- and low-frequency components across a pyramid backbone and realigns them to enrich semantic structure, while the CLIP-Guided Spatial-Adapter Module (CSM) tunes CLIP’s textual priors online to the downstream task distribution. Together, these components provide stronger semantic guidance under scarce annotations and enable robust pseudo-masks for support and query images. On Pascal-5i and COCO-20i, AFANet achieves state-of-the-art results, demonstrating the benefit of integrating frequency-domain cues with online CLIP adaptation for WFSS.

Abstract

Few-shot learning aims to recognize novel concepts by leveraging prior knowledge learned from a few samples. However, for visually intensive tasks such as few-shot semantic segmentation, pixel-level annotations are time-consuming and costly. Therefore, in this paper, we utilize the more challenging image-level annotations and propose an adaptive frequency-aware network (AFANet) for weakly-supervised few-shot semantic segmentation (WFSS). Specifically, we first propose a cross-granularity frequency-aware module (CFM) that decouples RGB images into high-frequency and low-frequency distributions and further optimizes semantic structural information by realigning them. Most existing WFSS methods use the textual information from a multi-modal vision-language model, e.g., CLIP, in an offline learning manner. In contrast, we further propose a CLIP-guided spatial-adapter module (CSM), which performs a spatial-domain adaptive transformation on textual information through online learning, thus providing enriched cross-modal semantic information for CFM. Extensive experiments on the Pascal-5i and COCO-20i datasets demonstrate that AFANet achieves state-of-the-art performance. The code is available at https://github.com/jarch-ma/AFANet.
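
To make the idea of decoupling an image into low- and high-frequency parts concrete, the following is a minimal sketch and not the authors' CFM. It assumes a single-channel image and an ideal circular low-pass mask in the Fourier domain; the function name `split_frequencies` and the `radius` parameter are hypothetical and do not come from the paper or its repository.

```python
import numpy as np

def split_frequencies(image, radius):
    """Split a 2-D array into low- and high-frequency parts.

    Assumption: an ideal circular low-pass mask in the Fourier domain.
    This is an illustrative choice, not the decomposition used in AFANet.
    """
    h, w = image.shape
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    cy, cx = h // 2, w // 2
    # Keep only frequencies within `radius` of the DC term.
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    # The high-frequency part is the residual, so low + high == image.
    high = image - low
    return low, high

# Usage on a random stand-in image (hypothetical data).
img = np.random.default_rng(0).random((32, 32))
low, high = split_frequencies(img, radius=4)
print(low.shape, high.shape)
```

In this sketch the low-frequency part carries the smooth, coarse structure and the residual carries edges and fine texture. A learned module would typically operate on backbone feature maps rather than raw pixels.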
