Boosting Few-Shot Semantic Segmentation Via Segment Anything Model
Chen-Bin Feng, Qi Lai, Kangdao Liu, Houcheng Su, Chi-Man Vong
TL;DR
This work tackles contour inaccuracies in few-shot semantic segmentation (FSS) by introducing a training-free post-processing pipeline built on the Segment Anything Model (SAM). The method generates prompts from an initial FSS mask, refines the prediction with SAM, and then applies a Prediction Result Selection (PRS) step to discard likely incorrect SAM outputs, so segmentation quality improves without additional training. The approach is validated on Pascal-$5^i$ and COCO-$20^i$, showing consistent improvements over strong baselines in both quantitative metrics (e.g., $mIoU$ and $FB\text{-}mIoU$) and qualitative edge and detail fidelity. Because it is training-free and plug-and-play, it can be applied directly to existing FSS systems, potentially expanding SAM’s utility in downstream tasks that require precise mask contours.
Abstract
In semantic segmentation, accurate prediction masks are crucial for downstream tasks such as medical image analysis and image editing. Because annotated data is scarce, few-shot semantic segmentation (FSS) methods struggle to predict masks with precise contours. The large foundation model Segment Anything Model (SAM) has recently been observed to handle fine detail well. Inspired by SAM, we propose FSS-SAM to boost FSS methods by addressing the issue of inaccurate contours. FSS-SAM is training-free: it works as a post-processing tool for any FSS method and can improve the accuracy of its predicted masks. Specifically, we use the masks predicted by an FSS method to generate prompts and then use SAM to predict new masks. To avoid accepting wrong masks from SAM, we propose a prediction result selection (PRS) algorithm, which markedly reduces wrong predictions. Experimental results on public datasets show that our method outperforms the base FSS methods both quantitatively and qualitatively.
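The post-processing flow described above (mask to prompts, prompts to a refined mask, then a selection step) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how prompts are built or how PRS decides, so the box-plus-point prompt, the IoU-agreement rule, and the `iou_threshold` value are all hypothetical, and `refine_fn` is a placeholder for whatever call wraps SAM.

```python
import numpy as np


def mask_to_prompts(mask):
    """Derive a box prompt (XYXY) and one foreground point (x, y) from a binary mask.

    Assumption: the paper's actual prompt-generation scheme is not given in
    the abstract; a bounding box plus a single on-mask point is one plausible choice.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None, None  # empty mask: nothing to prompt with
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])
    # Pick the foreground pixel closest to the centroid so the point lies on the mask.
    cy, cx = ys.mean(), xs.mean()
    k = np.argmin((ys - cy) ** 2 + (xs - cx) ** 2)
    point = np.array([xs[k], ys[k]])
    return box, point


def iou(a, b):
    """Intersection over union of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0


def refine_with_selection(fss_mask, refine_fn, iou_threshold=0.5):
    """Refine an FSS mask via `refine_fn`, keeping the result only if it looks plausible.

    `refine_fn(box, point)` is a hypothetical stand-in for a SAM call that
    returns a binary mask. The IoU-agreement test is an assumed substitute
    for the paper's PRS algorithm, whose criterion is not stated here.
    """
    box, point = mask_to_prompts(fss_mask)
    if box is None:
        return fss_mask
    refined = refine_fn(box, point)
    # Fall back to the original prediction when the refined mask disagrees too much.
    if iou(refined, fss_mask) >= iou_threshold:
        return refined
    return fss_mask
```

In this sketch, a refined mask that drifts far from the initial FSS prediction is treated as a likely failure and the original mask is kept, which mirrors the stated goal of excluding wrong SAM predictions without any training.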
