ZQBA: Zero Query Black-box Adversarial Attack

Joana C. Costa, Tiago Roxo, Hugo Proença, Pedro R. M. Inácio

TL;DR

ZQBA is a zero-query black-box adversarial attack: it takes feature maps from a surrogate DNN and adds them to clean images as perturbations, without ever querying the target model. The method transfers across architectures and datasets (CIFAR-10/100 and Tiny ImageNet) and degrades target accuracy more than single-query black-box baselines while preserving high perceptual image quality (SSIM). Ablation studies show that feature maps derived with Guided Backpropagation, combined with a random feature-map selection strategy, balance attack strength and imperceptibility, with an optimal perturbation weight $\alpha$ of 0.4. The work highlights practical vulnerabilities in real-world DNN deployments and provides open-source code for reproducibility.

Abstract

Current black-box adversarial attacks rely on either multiple queries or diffusion models to produce adversarial samples that impair the target model's performance. However, these methods require training a surrogate loss or a diffusion model, which limits their applicability in real-world settings. Thus, we propose a Zero Query Black-box Adversarial (ZQBA) attack that exploits the representations of Deep Neural Networks (DNNs) to fool other networks. Instead of requiring thousands of queries to produce deceiving adversarial samples, we take the feature maps obtained from a DNN and add them to clean images to impair the classification of a target model. The results suggest that ZQBA can transfer the adversarial samples to different models and across various datasets, namely CIFAR and Tiny ImageNet. The experiments also show that ZQBA is more effective than state-of-the-art black-box attacks with a single query, while maintaining the imperceptibility of perturbations, evaluated both quantitatively (SSIM) and qualitatively. This emphasizes the vulnerabilities of employing DNNs in real-world contexts. All the source code is available at https://github.com/Joana-Cabral/ZQBA.
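The abstract reports imperceptibility quantitatively through SSIM. As a rough illustration of what that metric measures, the sketch below computes a simplified, single-window SSIM over a whole image with NumPy. This is an assumption made for brevity, not the evaluation code from the ZQBA repository: standard SSIM implementations average the same statistic over local sliding windows, and the function name `global_ssim`, the synthetic image, and the Gaussian noise used as a stand-in perturbation are all hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed once over the whole image.

    Assumption: a single global window replaces the usual sliding
    local window, so values will differ from library SSIM scores.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical data: a random "clean" image and two noisy copies.
rng = np.random.default_rng(0)
clean = rng.random((32, 32, 3))
weak = np.clip(clean + 0.05 * rng.standard_normal(clean.shape), 0.0, 1.0)
strong = np.clip(clean + 0.5 * rng.standard_normal(clean.shape), 0.0, 1.0)

ssim_weak = global_ssim(clean, weak)      # small perturbation, score near 1
ssim_strong = global_ssim(clean, strong)  # large perturbation, lower score
```

A score close to 1 means the perturbed image is structurally close to the clean one, which is the sense in which a higher SSIM indicates a less perceptible perturbation.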

Paper Structure

This paper contains 10 sections, 1 equation, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Comparison between multi-query black-box approaches and the proposed Zero Query Black-box Adversarial (ZQBA) attack. During the attack phase, black-box attacks query the target model thousands of times to obtain the logits for the provided images, and adapt a loss function based on the model responses to generate better perturbations. ZQBA, in contrast, has two phases: 1) Setup, where all the feature maps (perturbations) are obtained; and 2) Attack, where previously obtained perturbations are added to the image to be attacked, without querying the target model.
  • Figure 2: Overview of the methodology used to obtain the perturbations for the ZQBA attack, through the extraction of feature maps, using Guided Backpropagation, from the layer prior to the classification layers.
  • Figure 3: Accuracy (%) and SSIM (%) of ZQBA with different weights for the feature maps in multiple architectures. The lines refer to an average of each model performance across the three considered datasets.
  • Figure 4: Original Image, Feature Map, Target Image, and Attacked Image for CIFAR, in the first three rows, and Tiny ImageNet, in the last three. The Original Image refers to the image used to obtain the Feature Map, the Target Image is the one whose classification the attacker wants to compromise, and the Attacked Image is the combination of the Feature Map and the Target Image.
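The two phases described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature maps are placeholder random arrays standing in for Guided Backpropagation outputs from a surrogate model, the min-max normalisation to [0, 1] and the clipping of the result are assumptions, and the function names `setup_phase` and `attack_phase` are hypothetical. Only the additive combination, the random selection strategy, and the weight of 0.4 come from the text above.

```python
import numpy as np

def setup_phase(surrogate_maps):
    """Setup: collect image-shaped feature maps into a perturbation bank.

    Assumption: each map is min-max normalised to [0, 1] so that the
    weight alpha has a comparable effect across maps.
    """
    bank = []
    for fm in surrogate_maps:
        fm = fm.astype(np.float64)
        span = fm.max() - fm.min()
        if span > 0:
            bank.append((fm - fm.min()) / span)
        else:
            bank.append(np.zeros_like(fm))
    return np.stack(bank)

def attack_phase(image, bank, alpha=0.4, rng=None):
    """Attack: add one randomly chosen feature map, weighted by alpha.

    The target model is never called here, which is the zero-query
    property. Clipping to [0, 1] is an assumption.
    """
    if rng is None:
        rng = np.random.default_rng()
    fm = bank[rng.integers(len(bank))]
    return np.clip(image + alpha * fm, 0.0, 1.0)

# Placeholder data standing in for Guided Backpropagation outputs.
rng = np.random.default_rng(0)
placeholder_maps = [rng.standard_normal((32, 32, 3)) for _ in range(8)]
bank = setup_phase(placeholder_maps)

target_image = rng.random((32, 32, 3))
attacked = attack_phase(target_image, bank, alpha=0.4, rng=rng)
unchanged = attack_phase(target_image, bank, alpha=0.0, rng=rng)
```

Because the bank is built once during Setup, the Attack step is a single array addition per image, and the cost of the attack does not depend on access to the target model.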