Robust SAM: On the Adversarial Robustness of Vision Foundation Models

Jiahuan Long, Zhengqin Xu, Tingsong Jiang, Wen Yao, Shuai Jia, Chao Ma, Xiaoqian Chen

TL;DR

This work investigates the adversarial robustness of vision foundation models, focusing on SAM and SAM 2. It introduces a cross-prompt adversarial attack that targets features common to both point and box prompts to improve transferability, and shows that the attack degrades segmentation performance across datasets. To counter these threats, the authors propose RobustSAM, a few-parameter defense that applies singular value decomposition to the encoder’s convolutional layers and adapts only the diagonal matrix of singular values, leaving the rest of the model frozen. Empirically, the cross-prompt attack achieves higher attack success rates than prior methods on both SAM and SAM 2, while RobustSAM improves mIoU under attack by at least 15% with a budget of only 512 trainable parameters and little loss in clean performance. The results suggest that small, strategically chosen parameter adjustments can bolster SAM’s robustness in practical deployments while keeping computational costs low.

Abstract

The Segment Anything Model (SAM) is a widely used vision foundation model with diverse applications, including image segmentation, detection, and tracking. Given SAM's wide applications, understanding its robustness against adversarial attacks is crucial for real-world deployment. However, research on SAM's robustness is still in its early stages. Existing attacks often overlook the role of prompts in evaluating SAM's robustness, and defense methods that balance robustness and accuracy remain underexplored. To address these gaps, this paper proposes an adversarial robustness framework designed to evaluate and enhance the robustness of SAM. Specifically, we introduce a cross-prompt attack method to enhance attack transferability across different prompt types. On the defense side, we propose a few-parameter adaptation strategy to defend SAM against various adversarial attacks. To balance robustness and accuracy, we use singular value decomposition (SVD) to constrain the space of trainable parameters so that only the singular values are adaptable. Experiments demonstrate that our cross-prompt attack method outperforms previous approaches in terms of attack success rate on both SAM and SAM 2. By adapting only 512 parameters, we achieve at least a 15% improvement in mean intersection over union (mIoU) against various adversarial attacks. Compared to previous defense methods, our approach enhances the robustness of SAM while preserving as much of its original performance as possible.
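
The idea of adapting only singular values can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the kernel shape, the flattening of the convolution weight to a 2-D matrix, and the helper names `decompose_conv_weight` and `rebuild_conv_weight` are all assumptions made for illustration. The sketch factors a weight into frozen bases and a vector of singular values, then rebuilds the weight after changing only that vector.

```python
import numpy as np


def decompose_conv_weight(weight):
    """Flatten a conv kernel (out_c, in_c, kh, kw) to 2-D and take its thin SVD.

    The flattening convention is an assumption for this sketch.
    """
    out_channels = weight.shape[0]
    weight_2d = weight.reshape(out_channels, -1)
    u, s, vt = np.linalg.svd(weight_2d, full_matrices=False)
    return u, s, vt


def rebuild_conv_weight(u, s, vt, shape):
    """Recombine the frozen bases U and V^T with (possibly updated) singular values."""
    # (u * s) scales column j of U by s[j], which equals U @ diag(s).
    return ((u * s) @ vt).reshape(shape)


# Hypothetical small kernel: 8 output channels, 4 input channels, 3x3 window.
rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 4, 3, 3))

u, s, vt = decompose_conv_weight(weight)

# Only the singular values would be trained; U and V^T stay fixed.
trainable_params = s.size
total_params = weight.size

# With unchanged singular values the original weight is recovered.
restored = rebuild_conv_weight(u, s, vt, weight.shape)

# Stand-in for a training update: rescale every singular value by 10%.
adapted = rebuild_conv_weight(u, s * 1.1, vt, weight.shape)

print(trainable_params, total_params)
```

In this toy setting the trainable vector has min(out_c, in_c * kh * kw) entries, far fewer than the full kernel, which is the sense in which such a scheme is "few-parameter". Which layer the paper adapts and how it arrives at 512 parameters is not specified in the text above, so the sketch makes no claim about that.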

Paper Structure

This paper contains 23 sections, 10 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: A comparison of the adversarial robustness of the original SAM and the proposed Robust SAM. Under attack, the original SAM predicts a segmentation mask with severely compromised precision (i.e., 0.26 mIoU with the ground truth), whereas our Robust SAM maximally preserves the quality of the segmentation mask (i.e., 0.98 mIoU with the ground truth) via a modified noise distribution.
  • Figure 2: The overall framework of our proposed adversarial attack and defense pipelines. (a) illustrates one iteration of the cross-prompt adversarial attack. Original examples are perturbed by a noise generator, and these perturbed examples are then input into SAM, disrupting key features for both box and point prompts. The effectiveness and progression of the attack are demonstrated over multiple iterations. (b) depicts the adversarial defense process. Within the SAM model, the parameters of the convolutional layer (Conv) are decomposed via the singular value decomposition (SVD). Only the matrix $\mathbf{P}$ in the new parameter space is updated, while other parameters in the model remain frozen. After the parameter adaptation, the fine-tuned SAM is validated using adversarial examples to evaluate its robustness.
  • Figure 3: Demonstrations of output differences for point and box prompts. It illustrates the negative impact on SAM's output when the feature map of each individual channel is set to zero (i.e., $f_i(\mathbf{X}) = 0$). Red boxes highlight the key features that have the greatest impact on mIoU for the point and box prompts.
  • Figure 4: Effects of attack and defense on the intermediate features in SAM. Top to bottom: feature maps of (a) a clean image, (b) an image corrupted by adversarial noise, and (c) an image corrupted by adversarial noise but defended by our proposed defense method.
  • Figure 5: Illustrations of various adversarial attacks against SAM using both point and box prompts. An attack is considered successful if it removes more than 50% of the ground truth masks for either point or box prompts. (a) illustrates that the point prompt attacks effectively compromise the ground truth masks under point prompts but are ineffective under box prompts. (b) depicts the box prompt attacks, which successfully disrupt the ground truth masks under box prompts but fail under point prompts. (c) shows that our cross-prompt attack degrades the performance for both point and box prompts simultaneously.
  • ...and 1 more figure
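
The success criterion quoted in the Figure 5 caption (more than 50% of the ground truth mask removed) can be sketched as follows. This is one plausible reading, not the authors' evaluation code: "removed" is interpreted here as the fraction of ground-truth pixels no longer covered by the predicted mask, and the function names and toy masks are hypothetical.

```python
import numpy as np


def iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)


def removed_fraction(pred, gt):
    """Fraction of ground-truth pixels missing from the prediction (assumed reading)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    gt_area = gt.sum()
    if gt_area == 0:
        return 0.0
    return float(1.0 - np.logical_and(pred, gt).sum() / gt_area)


def attack_succeeds(pred, gt, threshold=0.5):
    """True if more than `threshold` of the ground-truth mask was removed."""
    return removed_fraction(pred, gt) > threshold


# Toy 4x4 example: the ground truth covers two full rows (8 pixels).
gt = np.zeros((4, 4), dtype=bool)
gt[1:3, :] = True

# Hypothetical prediction under a point prompt after an attack: 2 of 8 pixels survive.
pred_point = np.zeros_like(gt)
pred_point[1, 0:2] = True

# Hypothetical prediction under a box prompt for the same image: 7 of 8 pixels survive.
pred_box = gt.copy()
pred_box[2, 3] = False

print(iou(pred_point, gt), attack_succeeds(pred_point, gt))
print(iou(pred_box, gt), attack_succeeds(pred_box, gt))
```

In this toy case the attack counts as successful under the point prompt but not under the box prompt, which mirrors the prompt-specific behaviour described for panel (a).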