Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2

Lv Tang, Bo Li

TL;DR

The paper evaluates the progression from SAM to SAM2 using camouflaged object detection (COD) as a testbed, showing that SAM2 delivers substantial improvements in promptable segmentation and video applicability but underperforms SAM in auto-mode object perception. Using COD datasets and standard metrics such as $S_\alpha$, $E_\phi$, $F_\beta$, $F^w_\beta$, and $F^{max}_\beta$ (with MAE as a complementary measure), the study reveals a clear trade-off: SAM2 excels when prompts guide segmentation yet struggles to detect multiple camouflaged objects on its own. It also reports that SAM tends to generate many masks in auto mode, while SAM2 produces far fewer, underscoring SAM2's reliance on prompts. These findings illuminate the strengths and limitations of the SAM family and motivate future work on balancing autonomous capabilities with prompt-driven performance. The results are available at https://github.com/luckybird1994/SAMCOD.
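As a rough illustration of two of the metrics named above, the sketch below computes MAE and a thresholded $F_\beta$ for a single prediction map against a binary ground-truth mask. It is not the paper's evaluation code: the function names `mae` and `f_beta`, the fixed threshold of 0.5, and the choice $\beta^2 = 0.3$ (a common convention in segmentation benchmarks) are assumptions made for this example.

```python
from typing import Sequence

# A mask is a 2D grid of values in [0, 1] (hypothetical type alias for this sketch).
Mask = Sequence[Sequence[float]]


def mae(pred: Mask, gt: Mask) -> float:
    """Mean absolute error between a prediction map and a ground-truth mask."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def f_beta(pred: Mask, gt: Mask, threshold: float = 0.5, beta_sq: float = 0.3) -> float:
    """F-measure after binarizing the prediction at `threshold`.

    beta_sq = 0.3 is an assumed convention, not a value taken from the paper.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold
            actual = g >= 0.5
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x2 example: one foreground pixel, one false positive at (1, 0).
pred = [[0.9, 0.2], [0.6, 0.1]]
gt = [[1, 0], [0, 0]]
print(f"MAE = {mae(pred, gt):.3f}, F_beta = {f_beta(pred, gt):.3f}")
```

Lower MAE and higher $F_\beta$ indicate a better match. Sweeping `threshold` over a range and keeping the best score would give a maximum-F style measure in the spirit of $F^{max}_\beta$, though the exact protocol used in the paper may differ.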

Abstract

The Segment Anything Model (SAM), introduced by Meta AI Research as a generic object segmentation model, quickly garnered widespread attention and significantly influenced the academic community. To extend its application to video, Meta further developed Segment Anything Model 2 (SAM2), a unified model capable of both video and image segmentation. SAM2 shows notable improvements over its predecessor in terms of applicable domains, promptable segmentation accuracy, and running speed. However, this report reveals that, compared to SAM, SAM2 in its auto mode is less able to perceive different objects in images without prompts. Specifically, we use the challenging task of camouflaged object detection to assess this performance decrease, hoping to inspire further exploration of the SAM model family by researchers. The results of this paper are provided at https://github.com/luckybird1994/SAMCOD.

Paper Structure (6 sections, 1 figure, 4 tables)

This paper contains 6 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Masks predicted by SAM and SAM2.