Enhancing Transferability of Targeted Adversarial Examples: A Self-Universal Perspective

Bowen Peng, Li Liu, Tianpeng Liu, Zhen Liu, Yongxiang Liu

TL;DR

This work tackles transfer-based targeted adversarial attacks on black-box DNNs by introducing a self-universal perspective: input transformations expose semantics already present in an individual image, so a perturbation optimized over transformed copies of that image generalizes much like one trained on many additional samples. It identifies simple image scaling as a scalable, sufficient, and necessary transformation for boosting targeted transferability, and builds S$^4$ST by adding orthogonal transformations and block-wise scaling to further diversify the transformed inputs. On the ImageNet-Compatible benchmark, S$^4$ST achieves a 19.8% absolute improvement in average targeted transfer success rate over state-of-the-art transformation methods while taking only 36% of the attack time, and it outperforms resource-intensive methods in multiple settings, including transfers to real-world APIs. The findings suggest that strong, self-universal transformations of the image itself can rival data-intensive approaches, with practical implications for evaluating model vulnerability and transferability.

Abstract

Transfer-based targeted adversarial attacks against black-box deep neural networks (DNNs) have proven significantly more challenging than untargeted ones. The impressive transferability of the current SOTA, the generative methods, comes at the cost of massive amounts of additional data and time-consuming training for each targeted label. This limits their efficiency and flexibility and significantly hinders their deployment in practical applications. In this paper, we offer a self-universal perspective that unveils the great yet underexplored potential of input transformations in pursuing this goal. Specifically, transformations universalize gradient-based attacks with intrinsic but overlooked semantics inherent within individual images, exhibiting similar scalability and comparable results to time-consuming learning over massive additional data from diverse classes. We also contribute a surprising empirical insight: one of the most fundamental transformations, simple image scaling, is highly effective, scalable, sufficient, and necessary in enhancing targeted transferability. We further augment simple scaling with orthogonal transformations and block-wise applicability, resulting in the Simple, faSt, Self-universal yet Strong Scale Transformation (S$^4$ST) for self-universal TTA. On the ImageNet-Compatible benchmark dataset, our method achieves a 19.8% improvement in the average targeted transfer success rate against various challenging victim models over existing SOTA transformation methods while consuming only 36% of the attack time. It also outperforms resource-intensive attacks by a large margin in various challenging settings.
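
To make the idea of "transformed copies of a single image" concrete, here is a minimal sketch of random image scaling in plain Python. It is an illustration only, not the authors' implementation: the helper names `scale_nearest` and `scaled_copies`, the floor-index resampling, and the scale range of 0.5 to 1.5 are all assumptions. It covers only the transformation step; the gradient-based attack loop that would optimize one perturbation across the copies is omitted.

```python
import random


def scale_nearest(img, factor):
    """Resize a 2-D image (list of rows) by `factor` using floor-index
    (nearest-neighbour style) sampling. Hypothetical helper for illustration."""
    h, w = len(img), len(img[0])
    new_h = max(1, round(h * factor))
    new_w = max(1, round(w * factor))
    # Map every output pixel back to a source pixel by index scaling.
    return [
        [img[min(h - 1, int(y * h / new_h))][min(w - 1, int(x * w / new_w))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def scaled_copies(img, n_copies, lo=0.5, hi=1.5, seed=0):
    """Return `n_copies` randomly rescaled versions of one image.
    The range [lo, hi] is an assumed setting, not a value from the paper."""
    rng = random.Random(seed)
    return [scale_nearest(img, rng.uniform(lo, hi)) for _ in range(n_copies)]


# A toy 4x4 "image" whose pixel value encodes its position.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
copies = scaled_copies(image, n_copies=5)
sizes = [(len(c), len(c[0])) for c in copies]
print(sizes)  # five (height, width) pairs, each side between 2 and 6
```

In a real attack, each copy would be passed through a surrogate model that accepts variable input sizes, and the gradients of the targeted loss would be propagated back through the scaling to the shared perturbation.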

Paper Structure (16 sections, 5 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: Comparison of targeted transfer success rate (tSuc) and time consumption for training and attack (minutes), averaged over 14 black-box models and 1,000 images. The surrogate model is ResNet-50, and the target label is "street sign". Our self-universal hypothesis links (a) universality across additional samples with (b) self-universality across transformed copies of a single sample, motivating the search for a potent transformation that enables flexible, efficient, and powerful self-universal TTA. We are particularly interested in the simple scaling transformation, which we empirically demonstrate to be scalable, sufficient, and necessary for enhancing targeted transferability. We further propose the S$^4$ST transformation, which surpasses both SOTA transformation methods and resource-intensive methods; see Figs. 2 and 3.
  • Figure 2: Illustration of the original image and its transformed copies by existing methods and our S$^4$ST.
  • Figure 3: Comparison of the average tSuc and the time consumption (seconds) to craft an AE on the ImageNet-Compatible dataset. The results are obtained using TMI as the baseline attack with a perturbation budget of $\ell_{\infty}=16/255$ at different iteration counts $T$. S$^4$ST is both more effective and more efficient than existing transformations. Compared with the SOTA transformation, BSR, S$^4$ST raises the average tSuc by an absolute 19.8% to 79.8% and cuts time consumption by a relative 64% to 0.41s.
  • Figure 4: Average tSuc evaluated on the ImageNet-Compatible dataset under the 10-Targets (all-source) setting [naseer2022stylized]. Input transformations are identified as playing an indispensable role in enabling effective simple TTA. Moreover, the transformation strength for simple TTA and the training sample volume for resource-intensive TTA have similar impacts on attack performance.
  • Figure 5: The focus of ResNet-50 varies significantly across scales.
  • ...and 4 more figures
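
As a quick consistency check on the numbers quoted in the Figure 3 caption, the following back-of-the-envelope arithmetic recovers the BSR baseline values they imply. It reads the 19.8% as an absolute difference in percentage points and the 64% as a relative reduction, as the caption states. The baseline values are back-computed here, not quoted from the paper.

$$
\text{tSuc}_{\text{BSR}} \approx 79.8\% - 19.8\% = 60.0\%
$$

$$
t_{\text{BSR}} \approx \frac{0.41\,\text{s}}{1 - 0.64} \approx 1.14\,\text{s},
\qquad
\frac{t_{\text{S}^4\text{ST}}}{t_{\text{BSR}}} \approx 1 - 0.64 = 0.36
$$

The last ratio matches the "36% of the attack time" figure given in the abstract: both statements describe the same per-example attack cost, not a training cost.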