Few-Shot Segmentation with Global and Local Contrastive Learning

Weide Liu, Zhonghua Wu, Henghui Ding, Fayao Liu, Jie Lin, Guosheng Lin, Wei Zhou

TL;DR

This work tackles few-shot segmentation by decoupling query information from the support guidance and extracting priors directly from unlabeled query images using a global-local contrastive learning framework. A prior extractor produces query priors, which, together with a cross-correspondence module that fuses support-guided cues, enable query-mask prediction. The approach reports state-of-the-art results on both PASCAL-5$^{i}$ and MS COCO, and its ablations show the benefits of local patch-based contrastive learning and SLIC-based patch generation. Because query information no longer depends on the support branch, the method reduces reliance on labeled support and improves generalization to novel classes in segmentation tasks.

Abstract

In this work, we address the challenging task of few-shot segmentation. Previous few-shot segmentation methods mainly employ the information of support images as guidance for query image segmentation. Although some works propose to build cross-reference between support and query images, their extraction of query information still depends on the support images. We instead propose to extract information from the query itself, independently of the support, to benefit the few-shot segmentation task. To this end, we first propose a prior extractor that learns query information from unlabeled images with our proposed global-local contrastive learning. Then, we extract a set of predetermined priors via this prior extractor. With the obtained priors, we generate prior region maps for query images, which locate the objects and serve as guidance for cross interaction with support features. In this way, the extraction of query information is detached from the support branch, which removes the limitation imposed by the support and yields more informative query clues for better interaction. Without bells and whistles, the proposed approach achieves new state-of-the-art performance for the few-shot segmentation task on the PASCAL-5$^{i}$ and COCO datasets.
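
The global-local contrastive objective mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the paper's formulation: it assumes a standard InfoNCE loss, applies it once to embeddings of two global views and once to embeddings of two local patches, and sums the two terms. The function names, the temperature, and the `weight` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE loss (assumed here, not taken from the paper).

    Row i of `anchors` is pulled toward row i of `positives`;
    every other row of `positives` acts as a negative.
    """
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature                      # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def global_local_loss(global_a, global_b, local_a, local_b, weight=1.0):
    """Sum of a global-view term and a local-patch term (illustrative only)."""
    loss_global = info_nce(global_a, global_b)
    loss_local = info_nce(local_a, local_b)
    return loss_global + weight * loss_local


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g1 = rng.normal(size=(8, 16))                 # embeddings of global view 1
    g2 = g1 + 0.05 * rng.normal(size=(8, 16))     # slightly perturbed view 2
    p1 = rng.normal(size=(8, 16))                 # embeddings of local patch 1
    p2 = p1 + 0.05 * rng.normal(size=(8, 16))     # slightly perturbed patch 2
    print(global_local_loss(g1, g2, p1, p2))
```

In this sketch the embeddings are random stand-ins; in practice they would come from the prior extractor applied to augmented views and to patches of the same unlabeled image.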

Paper Structure

This paper contains 21 sections, 15 equations, 7 figures, 7 tables, 2 algorithms.

Figures (7)

  • Figure 1: Comparison of the pipeline of our proposed Query Guided Network (QGNet) with those of previous state-of-the-art (SOTA) few-shot segmentation methods. Previous works (upper part) only employ support images' information as guidance for query mask estimation, while our QGNet (lower part) also uses clues from the query images, obtained with a query extractor, as guidance for the final query mask prediction.
  • Figure 2: Our proposed method consists of a self-correspondence module and a cross-correspondence module. Unlike the previous SOTA, a self-correspondence module (green) is proposed to extract prior features and generate a prior region map that locates the target object regions from the query image itself. A cross-correspondence module (blue) is proposed to generate a guided region map that identifies the query object region using category guidance from the masked support feature. Finally, the prior region map and guided region map are concatenated with category and bridge features for the final query mask prediction (FEM).
  • Figure 3: The difference between conventional contrastive learning and our proposed global-local contrastive learning. Conventional contrastive learning methods (left) only learn contrast from a global perspective. Our global-local contrastive learning takes two additional local patches as input and builds a contrastive loss across the global and local representations.
  • Figure 4: Visualization of the local patches generated by Felzenszwalb's method and SLIC.
  • Figure 5: Visualization results for the guided region map, the prior region map, and the query prediction generated by our proposed QGNet on the PASCAL-5$^{i}$ dataset.
  • ...and 2 more figures
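
The prior region map described in the Figure 2 and Figure 5 captions can be sketched as follows. The source does not state how the map is computed, so this example assumes cosine similarity between each prior vector and each query feature location, a maximum over priors, and min-max scaling to [0, 1]. The function name `prior_region_map` and all shapes are hypothetical.

```python
import numpy as np


def prior_region_map(query_feat, priors, eps=1e-12):
    """Illustrative prior region map (assumed computation, not the paper's).

    query_feat: array of shape (C, H, W), a query feature map.
    priors:     array of shape (K, C), a set of prior vectors.
    Returns an (H, W) map scaled to [0, 1]; high values mark locations
    whose features are close to at least one prior.
    """
    c, h, w = query_feat.shape
    feat = query_feat.reshape(c, h * w)
    feat = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + eps)
    pri = priors / (np.linalg.norm(priors, axis=1, keepdims=True) + eps)
    sim = pri @ feat                 # (K, H*W) cosine similarities
    best = sim.max(axis=0)           # strongest prior response per location
    lo, hi = best.min(), best.max()
    best = (best - lo) / (hi - lo + eps)  # min-max scaling (assumption)
    return best.reshape(h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(4, 3, 5))        # toy feature map: C=4, H=3, W=5
    priors = feat[:, 1, 2][None, :].copy()   # one prior copied from location (1, 2)
    region = prior_region_map(feat, priors)
    print(np.unravel_index(region.argmax(), region.shape))
```

A map like this would then be concatenated with the guided region map and the other features before the final mask prediction, as the Figure 2 caption describes.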