DarkSAM: Fooling Segment Anything Model to Segment Nothing
Ziqi Zhou, Yufei Song, Minghui Li, Shengshan Hu, Xianlong Wang, Leo Yu Zhang, Dezhong Yao, Hai Jin
TL;DR
This work exposes a critical vulnerability in the Segment Anything Model (SAM) by introducing DarkSAM, the first prompt-free universal attack framework against SAM: a single universal adversarial perturbation (UAP) that remains effective across images and prompts. DarkSAM combines a shadow target strategy, which uses the semantic blueprint of an image as the attack target, with a hybrid spatial-frequency attack that disrupts foreground and background semantics as well as texture information. Experiments on four datasets, against SAM and two of its variants, show strong attack capability and cross-model transferability. The findings have direct implications for the security of SAM-based pipelines and underscore the need for defenses against prompt-free UAPs that address both spatial semantics and texture cues.
Abstract
The Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks. Despite SAM's promise, its vulnerabilities, especially to universal adversarial perturbations (UAPs), have not yet been thoroughly investigated. In this paper, we propose DarkSAM, the first prompt-free universal attack framework against SAM, comprising a semantic decoupling-based spatial attack and a texture distortion-based frequency attack. We first divide the output of SAM into foreground and background. Then, we design a shadow target strategy to obtain the semantic blueprint of the image as the attack target. DarkSAM fools SAM by extracting and destroying crucial object features from images in both the spatial and frequency domains. In the spatial domain, we disrupt the semantics of both the foreground and background in the image to confuse SAM. In the frequency domain, we further enhance the attack's effectiveness by distorting the high-frequency components (i.e., texture information) of the image. Consequently, with a single UAP, DarkSAM renders SAM incapable of segmenting objects across diverse images with varying prompts. Experimental results on four datasets, for SAM and two of its variant models, demonstrate the strong attack capability and transferability of DarkSAM.
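To make two ideas from the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows (1) how a single perturbation can be shared across a batch of images under an L-infinity bound, and (2) one possible way to isolate high-frequency (texture) content so that a frequency-domain objective could measure how much it has been distorted. The function names (`apply_uap`, `high_frequency`, `texture_gap`), the FFT-based high-pass filter, the `cutoff` value, and the `eps` budget are all assumptions chosen for illustration; the abstract does not specify the transform or the hyperparameters DarkSAM uses.

```python
import numpy as np


def apply_uap(images, delta, eps=10 / 255):
    """Add one shared perturbation to every image in a batch.

    images: array of shape (N, H, W) with values in [0, 1].
    delta:  array of shape (H, W); the same perturbation for all images.
    eps:    L-infinity budget (illustrative value, an assumption).
    """
    delta = np.clip(delta, -eps, eps)          # enforce the perturbation budget
    return np.clip(images + delta, 0.0, 1.0)   # keep pixels in the valid range


def high_frequency(image, cutoff=0.1):
    """Return the high-frequency part of a 2-D image.

    Uses an FFT high-pass mask as a stand-in for whatever transform
    the paper actually uses (assumption). `cutoff` is a radial
    frequency in cycles per pixel.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    mask = np.sqrt(fy ** 2 + fx ** 2) > cutoff
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))


def texture_gap(clean, adv, cutoff=0.1):
    """Mean squared difference between high-frequency parts.

    A hypothetical quantity that a frequency-domain attack
    would try to increase.
    """
    diff = high_frequency(clean, cutoff) - high_frequency(adv, cutoff)
    return float(np.mean(diff ** 2))
```

In an actual attack, `delta` would be optimized over a training set so that the model's output on `apply_uap(images, delta)` degrades; that optimization loop, and the spatial-domain objective built on the shadow target strategy, are beyond what the abstract describes and are deliberately omitted here.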
