FSPGD: Rethinking Black-box Attacks on Semantic Segmentation
Eun-Sol Park, MiSo Park, Seung Park, Yong-Goo Shin
TL;DR
The paper tackles the limited transferability of black-box adversarial attacks on semantic segmentation by introducing FSPGD, which computes gradients from intermediate-layer features rather than from output predictions. It defines two feature-based losses, $L_{ex}$ and $L_{in}$, and combines them into a dynamically weighted objective $L=\lambda_t L_{ex}+(1-\lambda_t)L_{in}$. $L_{ex}$ targets local dissimilarity between clean and adversarial features, and $L_{in}$ targets contextual disruption through a Gram-matrix-based internal similarity term, together with a binarized spatial mask. Empirical results on Pascal VOC 2012 and Cityscapes show that FSPGD achieves state-of-the-art transferability across diverse backbones, including transformer-based architectures, and extensive ablations validate the benefits of attacking middle layers, of dynamic weighting, and of the threshold $\tau=\cos(\pi/3)$. The work provides new benchmarks for black-box segmentation attacks and highlights the importance of attacking intermediate representations to improve cross-model transferability.
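The weighted objective above can be illustrated with a minimal NumPy sketch. Only the form $L=\lambda_t L_{ex}+(1-\lambda_t)L_{in}$ and the threshold $\tau=\cos(\pi/3)$ come from the summary. The concrete definitions below are assumptions made for illustration: cosine similarity for $L_{ex}$, a normalized Gram matrix masked by clean-feature similarity for $L_{in}$, a linear schedule $\lambda_t=t/T$, and the hypothetical function names. They should not be read as the authors' implementation.

```python
import numpy as np

def _normalize(f, eps=1e-8):
    # Scale each feature vector (row) to unit length.
    return f / (np.linalg.norm(f, axis=-1, keepdims=True) + eps)

def external_loss(f_clean, f_adv):
    # Assumed L_ex: mean per-position cosine similarity between clean and
    # adversarial features of shape (N, C), N spatial positions, C channels.
    return float(np.mean(np.sum(_normalize(f_clean) * _normalize(f_adv), axis=-1)))

def internal_loss(f_clean, f_adv, tau=np.cos(np.pi / 3)):
    # Assumed L_in: Gram matrix of normalized adversarial features (N, N),
    # averaged over position pairs that are similar in the clean features.
    g_clean = _normalize(f_clean) @ _normalize(f_clean).T
    g_adv = _normalize(f_adv) @ _normalize(f_adv).T
    mask = (g_clean >= tau).astype(g_adv.dtype)  # binarized mask at threshold tau
    return float(np.sum(mask * g_adv) / max(mask.sum(), 1.0))

def combined_loss(f_clean, f_adv, t, total_steps):
    # Assumed linear schedule lambda_t = t / T; the summary only says "dynamic".
    lam = t / total_steps
    return lam * external_loss(f_clean, f_adv) + (1.0 - lam) * internal_loss(f_clean, f_adv)
```

Under these assumptions both terms are similarities, so an attacker would minimize the combined value. With `t = 0` the objective reduces to the internal term, and with `t = total_steps` it reduces to the external term.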
Abstract
Transferability, the ability of adversarial examples crafted for one model to deceive other models, is crucial for black-box attacks. Despite advancements in attack methods for semantic segmentation, transferability remains limited, reducing their effectiveness in real-world applications. To address this, we introduce the Feature Similarity Projected Gradient Descent (FSPGD) attack, a novel black-box approach that enhances both attack performance and transferability. Unlike conventional segmentation attacks that rely on output predictions for gradient calculation, FSPGD computes gradients from intermediate layer features. Specifically, our method introduces a loss function that targets local information by comparing features between clean images and adversarial examples, while also disrupting contextual information by accounting for spatial relationships between objects. Experiments on Pascal VOC 2012 and Cityscapes datasets demonstrate that FSPGD achieves superior transferability and attack performance, establishing a new state-of-the-art benchmark. Code is available at https://github.com/KU-AIVS/FSPGD.
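For readers unfamiliar with the projected gradient descent part of the name, the following is a minimal sketch of one generic signed-gradient PGD update under an $L_\infty$ budget. The step size `alpha`, the budget `eps`, the $[0, 1]$ pixel range, the descent direction, and the function name `pgd_step` are assumptions for illustration. The abstract does not state these details, and in FSPGD the gradient would come from a loss on intermediate-layer features rather than from output predictions.

```python
import numpy as np

def pgd_step(x_adv, x_clean, grad, alpha, eps):
    # One signed-gradient step that decreases the loss (assumes the attacker
    # minimizes a feature-similarity objective; grad is supplied by the caller).
    x_new = x_adv - alpha * np.sign(grad)
    # Project back into the L-infinity ball of radius eps around the clean image.
    x_new = np.clip(x_new, x_clean - eps, x_clean + eps)
    # Keep pixel values in the assumed valid range [0, 1].
    return np.clip(x_new, 0.0, 1.0)
```

Repeating this step for a fixed number of iterations, recomputing `grad` each time, yields the usual iterative attack loop.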
