Vanish into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2
Ziqi Zhou, Yifan Hu, Yufei Song, Zijing Li, Shengshan Hu, Leo Yu Zhang, Dezhong Yao, Long Zheng, Hai Jin
TL;DR
This paper investigates the robustness of SAM2 for video segmentation and identifies two challenges that limit existing attacks: directional guidance from the prompt and semantic entanglement across consecutive frames. It introduces UAP-SAM2, a cross-prompt universal adversarial attack driven by dual semantic deviation, which combines a target-scanning prompt-diversification strategy with three attack components: semantic confusion, feature shift, and memory misalignment. The overall objective is $J_{total}=J_{sa}+J_{fa}+J_{ma}$, optimized under a perturbation budget, as formalized in Definition 2.1. Empirical results across six datasets and two tasks show that UAP-SAM2 substantially degrades SAM2's segmentation performance, transfers across prompts and models, and outperforms SOTA baselines, while defenses such as pruning and input corruption offer only limited robustness. The work highlights critical vulnerabilities of video segmentation foundation models and lays the groundwork for developing robust SAM2 variants and defense mechanisms.
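As a reading aid, the budget-constrained objective mentioned above can be written out as a single optimization problem. This is only a sketch under stated assumptions: the symbols $\delta$ (the universal perturbation), $\epsilon$ (the budget), the $\ell_p$ norm, the data distribution $\mathcal{D}$, the prompt distribution $\mathcal{P}$, and the choice of minimization as the sign convention are not given in this summary and may differ from Definition 2.1 in the paper.

$$
\delta^{\star} = \arg\min_{\|\delta\|_{p} \le \epsilon} \; \mathbb{E}_{x \sim \mathcal{D},\, P \sim \mathcal{P}} \big[ J_{sa}(x+\delta, P) + J_{fa}(x+\delta, P) + J_{ma}(x+\delta, P) \big]
$$

Under these assumptions, a single $\delta$ is shared across all frames $x$ and all prompts $P$, which is what makes the perturbation both universal and cross-prompt.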
Abstract
Recent studies reveal the vulnerability of the image segmentation foundation model SAM to adversarial examples. Its successor, SAM2, has attracted significant attention due to its strong generalization capability in video segmentation. However, its robustness remains unexplored, and it is unclear whether existing attacks on SAM can be directly transferred to SAM2. In this paper, we first analyze the performance gap of existing attacks between SAM and SAM2 and highlight two key challenges arising from their architectural differences: directional guidance from the prompt and semantic entanglement across consecutive frames. To address these issues, we propose UAP-SAM2, the first cross-prompt universal adversarial attack against SAM2 driven by dual semantic deviation. For cross-prompt transferability, we design a target-scanning strategy that divides each frame into $k$ regions, each randomly assigned a prompt, to reduce prompt dependency during optimization. For effectiveness, we design a dual semantic deviation framework that optimizes a UAP by distorting the semantics within the current frame and disrupting the semantic consistency across consecutive frames. Extensive experiments on six datasets across two segmentation tasks demonstrate the effectiveness of the proposed method against SAM2. The comparative results show that UAP-SAM2 outperforms state-of-the-art (SOTA) attacks by a large margin.
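The target-scanning idea described in the abstract (split each frame into $k$ regions and give each region a random prompt) can be illustrated with a minimal sketch. The abstract does not specify how the regions are laid out or which prompt type is used, so the near-square grid, the use of point prompts, and the function name `target_scanning_prompts` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def target_scanning_prompts(height, width, k, rng=None):
    """Return k random (x, y) point prompts, one per grid cell of the frame.

    Assumptions (not from the paper): regions form a near-square grid,
    and each prompt is a single point sampled uniformly inside its cell.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = rng or random.Random()

    # Choose a grid with at least k cells that is as square as possible.
    cols = math.ceil(math.sqrt(k))
    rows = math.ceil(k / cols)
    if height < rows or width < cols:
        raise ValueError("frame is too small for k regions")

    prompts = []
    for idx in range(k):
        r, c = divmod(idx, cols)
        # Integer cell bounds; cells tile the frame without overlap.
        y0, y1 = r * height // rows, (r + 1) * height // rows
        x0, x1 = c * width // cols, (c + 1) * width // cols
        # Sample one pixel coordinate uniformly inside the cell.
        prompts.append((rng.randrange(x0, x1), rng.randrange(y0, y1)))
    return prompts


if __name__ == "__main__":
    # Example: 16 point prompts spread over a 480 x 640 frame.
    for x, y in target_scanning_prompts(480, 640, 16, random.Random(0)):
        print(x, y)
```

Resampling the prompts at every optimization step, as in this sketch, exposes the perturbation to many prompt locations, which is the intuition behind reducing prompt dependency; the paper's actual sampling scheme may differ.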
