COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection

Xiaoqin Zhang, Zhenni Yu, Li Zhao, Deng-Ping Fan, Guobao Xiao

TL;DR

COMPrompter reframes the Segment Anything Model (SAM) for camouflaged object detection (COD) with a multiprompt framework that combines the conventional box prompt with a boundary prompt derived from edge gradients. It introduces two components, the edge gradient extraction module (EGEM) and the box-boundary mutual guidance module (BBMG), and supplements the prompts with high-frequency features extracted from the image embeddings via the discrete wavelet transform (DWT). In experiments on COD benchmarks (COD10K, CAMO, NC4K) and polyp segmentation datasets, it reports state-of-the-art performance and strong generalization. The work highlights the potential of boundary-aware prompts to adapt foundation models to specialized tasks.

Abstract

We rethink the segment anything model (SAM) and propose a novel multiprompt network called COMPrompter for camouflaged object detection (COD). SAM has zero-shot generalization ability beyond other models and can provide an ideal framework for COD. Our network aims to enhance the single prompt strategy in SAM to a multiprompt strategy. To achieve this, we propose an edge gradient extraction module, which generates a mask containing gradient information regarding the boundaries of camouflaged objects. This gradient mask is then used as a novel boundary prompt, enhancing the segmentation process. Thereafter, we design a box-boundary mutual guidance module, which fosters more precise and comprehensive feature extraction via mutual guidance between a boundary prompt and a box prompt. This collaboration enhances the model's ability to accurately detect camouflaged objects. Moreover, we employ the discrete wavelet transform to extract high-frequency features from image embeddings. The high-frequency features serve as a supplementary component to the multiprompt system. Finally, our COMPrompter guides the network to achieve enhanced segmentation results, thereby advancing the development of SAM in terms of COD. Experimental results across COD benchmarks demonstrate that COMPrompter achieves cutting-edge performance, surpassing the current leading model by an average of 2.2% across the positive-oriented metrics on COD10K. In polyp segmentation, a specific application of COD, the experimental results show that our model is superior to top-tier methods as well. The code will be made available at https://github.com/guobaoxiao/COMPrompter.
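To make the high-frequency step concrete, here is a minimal sketch of a one-level 2D Haar wavelet transform on a single 2D channel. It is an illustration only, not the authors' implementation: the abstract does not say which wavelet is used or how the detail sub-bands are combined, so the Haar choice, the names `haar_dwt2` and `high_frequency`, and the sum-of-magnitudes combination are all assumptions. In the paper the transform is applied to image embeddings, which have many channels; this sketch would be applied per channel.

```python
def haar_dwt2(x):
    """One-level orthonormal 2D Haar DWT of a 2D list (even height and width).

    Returns (approx, row_diff, col_diff, diag), each half the input size.
    approx is the low-frequency band; the other three are detail bands.
    """
    h, w = len(x), len(x[0])
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    approx, row_diff, col_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        r_a, r_r, r_c, r_d = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]          # top-left, top-right
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom-left, bottom-right
            r_a.append((a + b + c + d) / 2.0)    # local average (low frequency)
            r_r.append((a + b - c - d) / 2.0)    # top row minus bottom row
            r_c.append((a - b + c - d) / 2.0)    # left column minus right column
            r_d.append((a - b - c + d) / 2.0)    # diagonal difference
        approx.append(r_a)
        row_diff.append(r_r)
        col_diff.append(r_c)
        diag.append(r_d)
    return approx, row_diff, col_diff, diag


def high_frequency(x):
    """Assumed combination: sum of detail-band magnitudes at each location."""
    _, row_diff, col_diff, diag = haar_dwt2(x)
    return [
        [abs(r) + abs(c) + abs(d) for r, c, d in zip(rr, cr, dr)]
        for rr, cr, dr in zip(row_diff, col_diff, diag)
    ]
```

A constant patch produces zero detail coefficients, while a patch containing a vertical edge produces a nonzero column-difference coefficient, which is why the detail bands are a plausible source of boundary cues.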

Paper Structure

This paper contains 16 sections, 5 equations, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Scatter plot representing the performance of competitors and our model on COD10K-Test. $F_\beta^\omega$, $S_\alpha$, and $E_\phi$ are positive-oriented, while $M$ is negative-oriented. The order of magnitude of $M$ differs from that of the other indices. For a more effective comparison, we take $M$ as the X-axis and the sum of the other three indicators as the Y-axis (Score = $F_\beta^\omega$ + $S_\alpha$ + $E_\phi$). Underlined names denote segment anything model (SAM)-based methods.
  • Figure 2: Pipeline of our COMPrompter framework (left) and details of the box-boundary mutual guidance module (BBMG) (right). Regarding the modules in SAM, the parameters in a module marked with a snowflake are fixed, while those in a module marked with a spark can be optimized via training. Purple arrows represent image processing, while blue ones represent the processing of boundary prompts. The dashed arrow represents loss calculation. To reduce the amount of computation, the part in the left dashed box is calculated in advance.
  • Figure 3: The overview of SAM Kirillov_2023_ICCV with box prompt.
  • Figure 4: EGEM details. The top half of the figure represents the process of object edge extraction. The bottom half represents the process of extracting the gradient of the whole image. Finally, the two images are multiplied to obtain the edge map containing the gradient.
  • Figure 5: Comparison of our COMPrompter and other methods, including MedSAM ma2023segment and SAM Kirillov_2023_ICCV, in terms of COD. Columns 1--3 are for the CAMO dataset, and Columns 4--6 are for the COD10K dataset.
  • ...and 6 more figures
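
The Figure 4 caption describes EGEM as two branches whose outputs are multiplied: an object edge map and a gradient map of the whole image. A rough sketch of that idea follows. It is not the authors' code, and the specific operators are assumptions: the edge is taken here from a binary object mask using 4-neighbour connectivity, and the gradient is a central-difference magnitude with clamped borders. The function names `mask_edge`, `gradient_magnitude`, and `boundary_prompt` are hypothetical.

```python
import math


def mask_edge(mask):
    """Mark foreground pixels that touch a background pixel (4-neighbourhood)."""
    h, w = len(mask), len(mask[0])
    edge = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if not mask[i][j]:
                continue
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and not mask[ni][nj]:
                    edge[i][j] = 1
                    break
    return edge


def gradient_magnitude(img):
    """Central-difference gradient magnitude; indices are clamped at borders."""
    h, w = len(img), len(img[0])

    def at(i, j):
        return img[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    return [
        [
            math.hypot(
                (at(i, j + 1) - at(i, j - 1)) / 2.0,  # horizontal difference
                (at(i + 1, j) - at(i - 1, j)) / 2.0,  # vertical difference
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def boundary_prompt(img, mask):
    """Element-wise product of the edge map and the image gradient map."""
    edge = mask_edge(mask)
    grad = gradient_magnitude(img)
    return [[e * g for e, g in zip(er, gr)] for er, gr in zip(edge, grad)]
```

The product keeps gradient values only along the object boundary, so the resulting map is zero everywhere else. This matches the caption's "edge map containing the gradient" in spirit, under the stated assumptions.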