Adversarial Example Soups: Improving Transferability and Stealthiness for Free

Bo Yang, Hengwei Zhang, Jindong Wang, Yulong Yang, Chenhao Lin, Chao Shen, Zhengyu Zhao

TL;DR

This work addresses a source of waste in adversarial transferability research: the suboptimal adversarial examples generated during hyperparameter tuning and stability testing are normally discarded. It introduces Adversarial Example Soups (AES), which averages these discarded examples (AES-tune for hyperparameter tuning, AES-rand for stability testing) to improve transferability and stealthiness, drawing inspiration from model soups. Across 10 state-of-the-art transfer attacks and 10 diverse (defensive) target models, AES improves transferability by up to 13% and also applies to combined attacks and in-the-wild adversarial examples. The approach yields flatter loss landscapes and higher-quality adversarial images, offering a practical, broadly applicable boost to black-box transfer attacks while also informing defense strategies.

Abstract

Transferable adversarial examples pose practical security risks because they can mislead a target model without any access to its internal details. A conventional recipe for maximizing transferability is to keep only the optimal adversarial example from all those obtained in the optimization pipeline. In this paper, for the first time, we revisit this convention and demonstrate that the discarded, sub-optimal adversarial examples can be reused to boost transferability. Specifically, we propose "Adversarial Example Soups" (AES), with AES-tune averaging the adversarial examples discarded during hyperparameter tuning and AES-rand averaging those discarded during stability testing. AES is inspired by "model soups", which average the weights of multiple fine-tuned models to improve accuracy without increasing inference time. Extensive experiments validate the global effectiveness of AES, which boosts 10 state-of-the-art transfer attacks and their combinations by up to 13% against 10 diverse (defensive) target models. We also show that AES can be generalized to other settings, e.g., directly averaging multiple in-the-wild adversarial examples that yield comparable success. A promising byproduct of AES is improved stealthiness of the adversarial examples, since the perturbation variances are naturally reduced.
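
The averaging step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `adversarial_example_soup`, the uniform weighting, the L-infinity budget `eps = 16 / 255`, and the random toy "sessions" are all assumptions made for illustration. Only the idea of averaging adversarial examples of the same image across sessions comes from the text above.

```python
import numpy as np

def adversarial_example_soup(adv_examples, clean=None, eps=None):
    """Average several adversarial examples of the same image.

    adv_examples: sequence of arrays with identical shape, values in [0, 1].
    clean, eps:   optional clean image and L-infinity budget (assumed norm).
    """
    stacked = np.stack([np.asarray(a, dtype=np.float64) for a in adv_examples])
    soup = stacked.mean(axis=0)  # uniform average (an assumption)
    if clean is not None and eps is not None:
        # The mean of points inside an L-infinity ball stays inside the ball,
        # so this clip only guards against floating-point rounding.
        soup = np.clip(soup, clean - eps, clean + eps)
    return np.clip(soup, 0.0, 1.0)  # keep a valid pixel range

# Toy usage: random perturbations stand in for the outputs of 10 sessions.
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(3, 8, 8))
eps = 16 / 255  # hypothetical perturbation budget
sessions = [
    np.clip(clean + rng.uniform(-eps, eps, size=clean.shape), 0.0, 1.0)
    for _ in range(10)
]
soup = adversarial_example_soup(sessions, clean=clean, eps=eps)
```

Because the averaging happens after the attacks have run, a sketch like this adds no extra gradient computation; it only reuses examples that the tuning or stability-testing sessions already produced.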

Paper Structure

This paper contains 20 sections, 2 equations, 6 figures, and 11 tables.

Figures (6)

  • Figure 1: Our Adversarial Example Soups (AES) attack consistently improves the transferability from Inc-v3 to diverse (defensive) target models on ImageNet. AES-tune (AES-rand) averages 10 sessions of adversarial examples from hyperparameter tuning (stability testing). Here we report the results for the well-known DIM attack; other attacks yield similar patterns.
  • Figure 2: The framework of our Adversarial Example Soups (AES) attack. For each original image, AES averages its corresponding adversarial examples generated from $m$ different experimental sessions of hyperparameter tuning or stability testing. Each session takes a specific setting of hyperparameters.
  • Figure 3: Visualizations of adversarial images generated by SSA with different hyperparameters vs. our AES. AES leads to higher image quality.
  • Figure 4: Ablation study on the number of sessions $m$. The surrogate model is Inc-v3.
  • Figure 5: Visualization of loss surfaces along two random directions for one adversarial image and flatness (lower is better) averaged over 1000 adversarial images on Inc-v3. AES leads to flatter local maxima.
  • ...and 1 more figure
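
The claim that averaging reduces perturbation variance, which underlies the image-quality observation in Figure 3, can be checked on toy data. The sketch below is only an idealized illustration: it assumes `m = 10` statistically independent sign perturbations with a hypothetical budget `eps = 16 / 255`. Real adversarial perturbations from related sessions are correlated, so the reduction in practice may be smaller than the roughly 1/m factor seen here.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 16 / 255  # hypothetical L-infinity budget
m = 10          # number of sessions, matching the 10 sessions in Figure 1

# Hypothetical stand-in for adversarial perturbations: independent +/- eps noise.
perturbations = eps * rng.choice([-1.0, 1.0], size=(m, 3, 64, 64))

single_var = perturbations[0].var()          # variance of one perturbation
soup_var = perturbations.mean(axis=0).var()  # variance of the averaged one
ratio = soup_var / single_var                # close to 1/m for independent noise

print(f"single: {single_var:.6f}  soup: {soup_var:.6f}  ratio: {ratio:.3f}")
```

For independent perturbations, the variance of the mean of m terms is 1/m of the single-term variance, which is why the averaged perturbation is less visible per pixel.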