Black-box Targeted Adversarial Attack on Segment Anything (SAM)

Sheng Zheng; Chaoning Zhang; Xinhong Hao

TL;DR

This paper studies the vulnerability of the Segment Anything Model (SAM) to targeted adversarial attacks in a practical black-box setting. It introduces PATA, a prompt-agnostic attack that targets only the image encoder, and two regularized variants, PATA+ and PATA++, that improve cross-model transferability by increasing the feature dominance of adversarial inputs. In the reported experiments, PATA++ achieves notable cross-model transferability across SAM variants, which underscores the need for defense mechanisms that keep segmentation robust under adversarial conditions. The work advances the understanding of SAM robustness and offers practical strategies for evaluating and improving attack transferability in real-world, prompt-flexible usage scenarios.

Abstract

Deep recognition models are widely vulnerable to adversarial examples, which change the model output by adding a quasi-imperceptible perturbation to the input image. Recently, the Segment Anything Model (SAM) has emerged as a popular foundation model in computer vision due to its impressive generalization to unseen data and tasks. Realizing flexible attacks on SAM helps in understanding its robustness in the adversarial context. To this end, this work aims to achieve a targeted adversarial attack (TAA) on SAM. Specifically, under a given prompt, the goal is to make the predicted mask of an adversarial example resemble that of a given target image. TAA on SAM has been realized in a recent arXiv work in the white-box setup, which assumes access to both the prompt and the model and is therefore less practical. To remove the dependence on the prompt, we propose a simple yet effective approach that attacks only the image encoder. Moreover, we propose a novel regularization loss that enhances cross-model transferability by increasing the feature dominance of adversarial images over random natural images. Extensive experiments verify the effectiveness of these simple techniques for conducting a successful black-box TAA on SAM.
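The encoder-only idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear `toy_encoder` is a hypothetical stand-in for SAM's image encoder, and the mean-squared feature-matching loss, the PGD-style signed update, and the `eps`/`alpha`/`steps` values are assumptions chosen for illustration. The regularization loss used by PATA+ and PATA++ is not specified in this summary and is therefore not sketched.

```python
import numpy as np


def toy_encoder(x, W):
    """Hypothetical stand-in for an image encoder: a fixed linear map."""
    return W @ x


def feature_loss(x, x_target, W):
    """Mean-squared distance between the features of x and of the target."""
    diff = toy_encoder(x, W) - toy_encoder(x_target, W)
    return float(np.mean(diff ** 2))


def targeted_encoder_attack(x_clean, x_target, W,
                            eps=8 / 255, alpha=1 / 255, steps=50):
    """PGD-style targeted attack that only uses encoder features.

    No prompt appears anywhere in the loss, which is what makes the
    perturbation prompt-agnostic in this sketch.
    """
    feat_target = toy_encoder(x_target, W)
    x_adv = x_clean.copy()
    for _ in range(steps):
        diff = toy_encoder(x_adv, W) - feat_target
        # Analytic gradient of the mean-squared loss for a linear encoder.
        grad = 2.0 * (W.T @ diff) / diff.size
        # Signed descent step: move the features toward the target features.
        x_adv = x_adv - alpha * np.sign(grad)
        # Project back into the L-infinity ball around the clean image.
        x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
        # Keep pixel values in the valid range.
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv


rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))      # toy encoder weights (assumption)
x_clean = rng.uniform(size=64)     # flattened "clean image"
x_target = rng.uniform(size=64)    # flattened "target image"

x_adv = targeted_encoder_attack(x_clean, x_target, W)
loss_before = feature_loss(x_clean, x_target, W)
loss_after = feature_loss(x_adv, x_target, W)
print(f"feature loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

With a real model, the analytic gradient would be replaced by automatic differentiation through the encoder; the projection and clipping steps would stay the same.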

Paper Structure (16 sections, 7 equations, 6 figures, 5 tables)

Introduction
Related Work
Adversarial Attacks
Segment Anything Model (SAM)
Background and Task
SAM Structure and Work Mechanism
Targeted Adversarial Attack (TAA) on SAM
Evaluation Under PGD-based Attack
Boosting Cross-prompt Transferability
Cross-prompt Attack-SAM
Our Proposed Method
Boosting Cross-model Transferability
Relative Feature Strength
Cross-model Results
Conclusion
...and 1 more section

Figures (6)

Figure 1: The process of Segment Anything Model.
Figure 2: The mIoU results of cross-prompt Attack-SAM for training and testing with increasing training prompt points.
Figure 3: Feature dominance of PATA, PATA+, and PATA++.
Figure 4: Cross-model results of PATA++ under point prompts.
Figure 5: Cross-model results of PATA++ under box prompts.
...and 1 more figure
