SAM-IF: Leveraging SAM for Incremental Few-Shot Instance Segmentation

Xudong Zhou, Wenhao He

TL;DR

SAM-IF adapts the class-agnostic SAM2 to incremental few-shot instance segmentation (FSIS) by fine-tuning it with a multi-class, background-aware classifier and a cosine-similarity-based head for few-shot adaptation. It supports incremental learning by updating only the classifier weights for novel categories, so the decoder is not retrained. Evaluated on COCO2014 with a 60/20 base/novel class split in a 1-shot setting, the method generalizes competitively with prior FSIS approaches, and it uses erosion and targeted sampling to suppress background and focus on the foreground. This makes it practical for dynamic environments where new object categories must be added with minimal labeled data and computational overhead. Future work targets richer prompts, optimized classifier training, and reduced reliance on SAM embeddings to improve robustness and accuracy in challenging scenes.

Abstract

We propose SAM-IF, a novel method for incremental few-shot instance segmentation that leverages the Segment Anything Model (SAM). Because SAM is class-agnostic, SAM-IF introduces a multi-class classifier and fine-tunes SAM to focus on specific target objects. To enhance few-shot learning, SAM-IF employs a cosine-similarity-based classifier, enabling efficient adaptation to novel classes with minimal data. SAM-IF also supports incremental learning by updating classifier weights without retraining the decoder. Our method achieves competitive, and arguably more reasonable, results compared to existing approaches, particularly in scenarios that require segmenting specific objects with limited labeled data.
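The incremental step described in the abstract can be illustrated with a minimal sketch of a cosine-similarity classifier. This is not the authors' implementation: the class name `CosineClassifier`, the `scale` temperature, the toy 3-dimensional embeddings, and the choice to build a novel class weight as the normalized mean of its support embeddings are all assumptions made for illustration. The point it shows is that adding a novel class only appends one weight vector and leaves existing weights untouched.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


class CosineClassifier:
    """Hypothetical cosine-similarity head: one unit-norm weight vector per class."""

    def __init__(self, scale=10.0):
        self.scale = scale   # assumed temperature applied to the cosine scores
        self.weights = {}    # class name -> unit-norm weight vector

    def add_class(self, name, support_embeddings):
        """Register a class from a few support embeddings.

        Assumption: the class weight is the normalized mean of the
        normalized support embeddings (a common prototype construction).
        """
        normalized = [l2_normalize(e) for e in support_embeddings]
        dim = len(normalized[0])
        mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
        self.weights[name] = l2_normalize(mean)

    def scores(self, embedding):
        """Scaled cosine similarity between one embedding and every class weight."""
        q = l2_normalize(embedding)
        return {
            name: self.scale * sum(a * b for a, b in zip(q, w))
            for name, w in self.weights.items()
        }

    def predict(self, embedding):
        """Return the class whose weight is most similar to the embedding."""
        s = self.scores(embedding)
        return max(s, key=s.get)


# Base classes (including an explicit background class), toy embeddings.
clf = CosineClassifier()
clf.add_class("background", [[0.0, 0.0, 1.0]])
clf.add_class("cat", [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]])
base_cat_weight = list(clf.weights["cat"])

# Incremental step: a novel class arrives with a single support example (1-shot).
# Only a new weight vector is added; nothing else is retrained.
clf.add_class("dog", [[0.1, 1.0, 0.0]])
```

Because the scores depend only on the angle between vectors, a weight built from one or a few examples is directly comparable with the weights of the base classes, which is what makes this kind of head convenient for few-shot, incremental updates.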

Paper Structure

This paper contains 25 sections, 5 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The architecture of SAM-IF.
  • Figure 2: Construction of Class Weights for Novel Categories.
  • Figure 3: Segmentation results in ideal, moderate, and poor cases. The ideal result shows accurate segmentation with a clear subject and minimal background clutter. The moderate result classifies the subject correctly but includes some irrelevant segments. The poor result fails to segment the subject and incorrectly segments many small background objects. The numbers on the left are class IDs, and the numbers to the right of Pred Mask are confidence scores.
  • Figure 4: Analysis of low AP50 caused by missing annotations in COCO and SAM's fragmented segmentation.