Black-box Targeted Adversarial Attack on Segment Anything (SAM)
Sheng Zheng, Chaoning Zhang, Xinhong Hao
TL;DR
The paper addresses the vulnerability of the Segment Anything Model (SAM) to targeted adversarial attacks in a practical black-box setting. It introduces PATA, a prompt-agnostic attack that targets only the image encoder, and further improves cross-model transferability with regularized variants (PATA+ and PATA++) that increase the feature dominance of adversarial inputs. In extensive experiments, PATA++ achieves notable cross-model transferability across SAM variants, highlighting the need for defense mechanisms that keep segmentation robust under adversarial conditions. The work advances the understanding of SAM robustness and provides practical strategies for evaluating and improving attack transferability in real-world, prompt-flexible usage scenarios.
Abstract
Deep recognition models are widely vulnerable to adversarial examples, which change the model output by adding a quasi-imperceptible perturbation to the image input. Recently, the Segment Anything Model (SAM) has emerged as a popular foundation model in computer vision due to its impressive generalization to unseen data and tasks. Realizing flexible attacks on SAM is beneficial for understanding its robustness in the adversarial context. To this end, this work aims to achieve a targeted adversarial attack (TAA) on SAM. Specifically, under a certain prompt, the goal is to make the predicted mask of an adversarial example resemble that of a given target image. A recent arXiv work realized TAA on SAM in the white-box setup, which assumes access to both the prompt and the model and is therefore less practical. To address the issue of prompt dependence, we propose a simple yet effective approach that attacks only the image encoder. Moreover, we propose a novel regularization loss that enhances cross-model transferability by increasing the feature dominance of adversarial images over random natural images. Extensive experiments verify the effectiveness of our proposed simple techniques in conducting a successful black-box TAA on SAM.
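To make the encoder-only idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear map `W` as a stand-in for the SAM image encoder, a squared-L2 feature-matching loss, and an L-infinity budget `eps` with signed-gradient steps; all of these names and values are illustrative assumptions. The regularization loss is omitted because the abstract does not specify its form.

```python
import numpy as np

def encoder(W, x):
    """Toy linear stand-in for an image encoder (assumption, not SAM)."""
    return W @ x

def feature_loss(W, x, x_target):
    """Squared L2 distance between the features of x and of the target image."""
    diff = encoder(W, x) - encoder(W, x_target)
    return float(np.sum(diff ** 2))

def feature_matching_attack(W, x_clean, x_target, eps=8 / 255, step=2 / 255, iters=100):
    """Prompt-free targeted attack sketch: pull the adversarial features
    toward the target features while staying within eps of the clean image."""
    target_feat = encoder(W, x_target)
    x_adv = x_clean.copy()
    best, best_loss = x_adv.copy(), feature_loss(W, x_adv, x_target)
    for _ in range(iters):
        diff = encoder(W, x_adv) - target_feat
        grad = 2.0 * W.T @ diff  # analytic gradient, valid for the linear toy encoder
        x_adv = x_adv - step * np.sign(grad)  # signed-gradient descent step
        x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)  # project onto the L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)  # keep a valid pixel range
        loss = feature_loss(W, x_adv, x_target)
        if loss < best_loss:  # keep the best iterate seen so far
            best, best_loss = x_adv.copy(), loss
    return best

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))      # hypothetical encoder weights
x_clean = rng.uniform(size=64)     # flattened clean "image" in [0, 1]
x_target = rng.uniform(size=64)    # flattened target "image" in [0, 1]
x_adv = feature_matching_attack(W, x_clean, x_target)
print(feature_loss(W, x_clean, x_target), feature_loss(W, x_adv, x_target))
```

No prompt appears anywhere in the loss, which is what makes an attack of this shape prompt-agnostic: whatever prompt is later fed to the mask decoder, it operates on features that have been pushed toward those of the target image.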
