Attack-SAM: Towards Attacking Segment Anything Model With Adversarial Examples

Chenshuang Zhang, Chaoning Zhang, Taegoo Kang, Donghun Kim, Sung-Ho Bae, In So Kweon

TL;DR

This work introduces Attack-SAM, a framework for studying adversarial perturbations against SAM's promptable mask predictions. It shows that SAM is vulnerable to mask-removal attacks in the full white-box setting, that such attacks can transfer across prompts and tasks, and that adversarial perturbations can enlarge or manipulate masks and even generate an arbitrary target mask. The authors propose ClipMSE as an effective loss, demonstrate strong white-box attacks, and explore transfer-based attacks and attack goals beyond mask removal, highlighting safety concerns for deploying SAM in security-sensitive contexts. The findings call for new robustness strategies to keep prompt-driven segmentation reliable in adversarial environments.

Abstract

Segment Anything Model (SAM) has attracted significant attention recently due to its impressive performance on various downstream tasks in a zero-shot manner. The computer vision (CV) area might follow the natural language processing (NLP) area in embarking on a path from task-specific vision models toward foundation models. However, deep vision models are widely recognized as vulnerable to adversarial examples, which fool the model into making wrong predictions with imperceptible perturbations. Such vulnerability to adversarial attacks raises serious concerns when applying deep models to security-sensitive applications. Therefore, it is critical to know whether the vision foundation model SAM can also be fooled by adversarial attacks. To the best of our knowledge, our work is the first of its kind to conduct a comprehensive investigation of how to attack SAM with adversarial examples. With the basic attack goal set to mask removal, we investigate the adversarial robustness of SAM in the full white-box setting and in transfer-based black-box settings. Beyond the basic goal of mask removal, we further investigate and find that it is possible to generate any desired mask with an adversarial attack.
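
To make the mask-removal goal concrete, here is a minimal sketch of a PGD-style attack on a toy per-pixel "segmenter" in pure Python. It does not use SAM, and everything in it is an assumption for illustration: the toy model (`toy_logits`, `GAIN`), the hyperparameters (`eps`, `alpha`, `steps`, `neg_th`), and in particular the clipped squared-error objective, which is only one plausible reading of a ClipMSE-style loss and is not taken from the paper's definition. A pixel counts as inside the mask when its logit is positive, so the attack pushes logits down to a negative threshold within an L-infinity budget.

```python
# Toy stand-in for a segmentation model: one logit per pixel.
# GAIN and the 0.5 offset are hypothetical; SAM itself is not involved.
GAIN = 40.0


def toy_logits(image):
    """Per-pixel logits; a pixel is in the mask when its logit is > 0."""
    return [GAIN * (p - 0.5) for p in image]


def clipped_mse(logits, neg_th):
    """Assumed ClipMSE-style loss: penalise a logit only while it exceeds neg_th."""
    return sum((max(z, neg_th) - neg_th) ** 2 for z in logits)


def clipped_mse_grad(image, neg_th):
    """Analytic gradient of clipped_mse(toy_logits(image), neg_th) per pixel."""
    return [2.0 * (z - neg_th) * GAIN if z > neg_th else 0.0
            for z in toy_logits(image)]


def sign(v):
    """Return -1, 0, or 1."""
    return (v > 0) - (v < 0)


def pgd_mask_removal(clean, eps=0.05, alpha=0.01, steps=10, neg_th=-1.0):
    """Signed-gradient descent on the loss, projected onto the eps-ball."""
    adv = list(clean)
    for _ in range(steps):
        grad = clipped_mse_grad(adv, neg_th)
        # Step against the gradient to lower the loss (shrink the mask).
        adv = [a - alpha * sign(g) for a, g in zip(adv, grad)]
        # Keep each pixel within eps of the clean value and inside [0, 1].
        adv = [min(max(a, c - eps, 0.0), c + eps, 1.0)
               for a, c in zip(adv, clean)]
    return adv


clean = [0.30, 0.52, 0.52, 0.30]  # two "object" pixels between background
adv = pgd_mask_removal(clean)
clean_mask = [z > 0 for z in toy_logits(clean)]
adv_mask = [z > 0 for z in toy_logits(adv)]
print(clean_mask, adv_mask)
```

In this sketch the clipping means pixels whose logits are already below `neg_th` receive zero gradient, so the perturbation budget is spent only on pixels that still contribute to the mask. Setting `steps=1` with `alpha=eps` would give a single-step, FGSM-style variant of the same loop.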

Paper Structure (20 sections, 6 equations, 53 figures, 2 tables)


Figures (53)

  • Figure 1: $x_{clean}$
  • Figure 2: $x_{fgsm}$
  • Figure 3: $x_{pgd}$
  • Figure 4: $Mask_{clean}$
  • Figure 5: $Mask_{fgsm}$
  • ...and 48 more figures