Practical Region-level Attack against Segment Anything Models
Yifan Shen, Zhengyuan Li, Gang Wang
TL;DR
This work addresses the practical robustness of Segment Anything Models (SAM) by introducing region-level adversarial attacks that do not require knowledge of the exact user prompt. It presents the Sampling-based Region Attack (S-RA) and the Transferable Region Attack (T-RA), with T-RA leveraging Spectrum Transformation to improve black-box transferability across SAM variants. Extensive experiments across ViT-B/L/H backbones and multiple SAM variants (EfficientSAM, Fast-SAM, MobileSAM, HQ-SAM) show that T-RA can drastically reduce segmentation performance (mean IoU $<0.10$ in many black-box settings) and can even degrade the performance of a real-world SAM service, underscoring practical security concerns. The results motivate defenses such as adversarial training, input transformations, and more robust architectures, and suggest extending region-level attacks to other prompt types and segmentation models as future work.
Abstract
Segment Anything Models (SAM) have made significant advancements in image segmentation, allowing users to segment target portions of an image with a single click (i.e., a user prompt). Given their broad applications, the robustness of SAM against adversarial attacks is a critical concern. While recent works have explored adversarial attacks against a pre-defined prompt/click, their threat model is not yet realistic: (1) they often assume the user-click position is known to the attacker (point-based attack), and (2) they often operate under a white-box setting with limited transferability. In this paper, we propose a more practical region-level attack in which attackers do not need to know the precise user prompt. The attack remains effective when the user clicks on any point on the target object in the image, hiding the object from SAM. In addition, by adapting a spectrum transformation method, we make the attack more transferable under a black-box setting. Both controlled experiments and tests against real-world SAM services confirm its effectiveness.
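To make the region-level threat model concrete, the following is a minimal sketch of how one might evaluate an attack when the user's click is unknown: candidate clicks are sampled from inside the target region, and segmentation quality is averaged over them as mean IoU. It does not reproduce the paper's S-RA or T-RA optimization. All names here (`sample_region_prompts`, `iou`, `mean_iou_over_prompts`, and the `segment_fn` callable standing in for a SAM-like model) are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_region_prompts(region_mask, k, rng):
    """Draw k click coordinates (row, col) uniformly from inside a binary region mask."""
    coords = np.argwhere(region_mask)  # all (row, col) pixels inside the region
    if len(coords) == 0:
        raise ValueError("region mask is empty")
    idx = rng.integers(0, len(coords), size=k)  # sample with replacement
    return coords[idx]


def iou(pred_mask, gt_mask):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, gt).sum() / union)


def mean_iou_over_prompts(segment_fn, image, gt_mask, prompts):
    """Average IoU of segment_fn(image, click) against gt_mask over sampled clicks.

    segment_fn is an assumed placeholder for any point-prompted segmenter;
    it takes an image and a (row, col) click and returns a binary mask.
    """
    scores = [iou(segment_fn(image, (int(r), int(c))), gt_mask) for r, c in prompts]
    return float(np.mean(scores))
```

Under this setup, a successful region-level attack would be one where the mean IoU on the perturbed image stays low for clicks sampled anywhere on the target object, not only for a single known click.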
