Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability
Junqi Gao, Biqing Qi, Yao Li, Zhichang Guo, Dong Li, Yuming Xing, Dazhi Zhang
TL;DR
This work investigates targeted adversarial transferability by identifying High-Sample-Density-Regions (HSDR) as regions where different models behave consistently. It shows that perturbations toward the HSDR of the target class yield more transferable targeted attacks, and that easy samples with low loss tend to lie in HSDR, which allows guidance without density estimation. The authors introduce ESMA, a multi-target generative attack trained in two stages: latent-feature embeddings are first pre-trained under manifold-matching guidance, and a U-Net-based generator is then trained to map source samples toward easy-target anchors. A single model can therefore attack all classes with reduced storage and computation. Empirical results on ImageNet demonstrate ESMA's superior targeted transferability compared to the state-of-the-art TTP, along with substantial efficiency gains, supported by ablations that validate the components and the Loss+Gradnorm screening concept. Overall, the work provides both practical attack tooling and theoretical insight into model output consistency in HSDR, with implications for dataset design and defense research.
Abstract
The transferability of adversarial perturbations provides an effective shortcut for black-box attacks. Targeted perturbations have greater practicality but are more difficult to transfer between models. In this paper, we experimentally and theoretically demonstrate that neural networks trained on the same dataset behave more consistently in the High-Sample-Density-Regions (HSDR) of each class than in low-sample-density regions. Therefore, in the targeted setting, adding perturbations towards the HSDR of the target class is more effective in improving transferability. However, density estimation is challenging in high-dimensional scenarios. Further theoretical and experimental verification demonstrates that easy samples with low loss are more likely to be located in HSDR. Perturbing towards such easy samples in the target class therefore locates HSDR without density estimation. Based on these facts, we verify that adding perturbations towards easy samples in the target class improves the targeted adversarial transferability of existing attack methods. We propose a generative targeted attack strategy named Easy Sample Matching Attack (ESMA), which has a higher success rate for targeted attacks and outperforms the SOTA generative method. Moreover, ESMA requires only 5% of the storage space and much less computation time compared to the current SOTA, as ESMA attacks all classes with only one model instead of separate models for each class. Our code is available at https://github.com/gjq100/ESMA.
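As a rough illustration of the idea that low-loss samples can stand in for HSDR, the sketch below picks the k lowest-loss samples of a target class as candidate anchors. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the toy logits, and the use of cross-entropy loss alone (without the gradient-norm term mentioned in the TL;DR) are all illustrative choices.

```python
import numpy as np

def per_sample_cross_entropy(logits, labels):
    """Cross-entropy loss of each sample, computed from raw logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def select_easy_samples(logits, labels, target_class, k):
    """Return dataset indices of the k lowest-loss samples of target_class.

    Hypothetical helper: low loss is used here as a proxy for lying in a
    high-sample-density region, so no density estimate is computed.
    """
    losses = per_sample_cross_entropy(logits, labels)
    class_idx = np.flatnonzero(labels == target_class)
    order = np.argsort(losses[class_idx], kind="stable")
    return class_idx[order[:k]]

# Toy logits standing in for a trained classifier's outputs (made up).
logits = np.array([
    [4.0, 0.0, 0.0],  # class 0, confident
    [0.5, 0.4, 0.0],  # class 0, uncertain
    [0.0, 3.0, 0.0],  # class 1, confident
    [2.0, 0.0, 0.0],  # class 0, moderately confident
    [0.0, 0.2, 0.1],  # class 1, uncertain
])
labels = np.array([0, 0, 1, 0, 1])

easy = select_easy_samples(logits, labels, target_class=0, k=2)
print(easy)
```

In this toy example the two confidently classified class-0 samples are selected, and the uncertain one is left out. A targeted attack would then perturb a source image towards such anchors instead of towards an arbitrary target-class sample.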
