Quantization Aware Attack: Enhancing Transferable Adversarial Attacks by Model Quantization

Yulong Yang; Chenhao Lin; Qian Li; Zhengyu Zhao; Haoran Fan; Dawei Zhou; Nannan Wang; Tongliang Liu; Chao Shen

TL;DR

The paper tackles the problem of transferring adversarial examples to quantized neural networks (QNNs) with unknown architectures and bitwidths in black-box settings. It shows that using an extremely low-bitwidth QNN as the substitute model can boost cross-architecture transferability, and it introduces Quantization Aware Attack (QAA), which fine-tunes a QNN substitute model with a multiple-bitwidth training objective to reduce quantization shift and gradient misalignment. In experiments on ImageNet and CIFAR-10, QAA consistently improves transferability against standardly trained DNNs, adversarially trained DNNs, and QNN targets. The training overhead is modest (one epoch of fine-tuning), and the method is compatible with both QAT and PTQ workflows. The work also explains the effectiveness of QAA from the loss-landscape and gradient-alignment perspectives, with practical implications for evaluating and securing QNN deployments.

Abstract

Quantized neural networks (QNNs) have received increasing attention in resource-constrained scenarios due to their exceptional generalizability. However, their robustness against realistic black-box adversarial attacks has not been extensively studied. In this scenario, adversarial transferability is pursued across QNNs with different quantization bitwidths, which particularly involve unknown architectures and defense methods. Previous studies claim that transferability is difficult to achieve across QNNs with different bitwidths on the condition that they share the same architecture. However, we discover that under different architectures, transferability can be largely improved by using a QNN quantized with an extremely low bitwidth as the substitute model. We further improve the attack transferability by proposing quantization aware attack (QAA), which fine-tunes a QNN substitute model with a multiple-bitwidth training objective. In particular, we demonstrate that QAA addresses the two issues that are commonly known to hinder transferability: 1) quantization shifts and 2) gradient misalignments. Extensive experimental results validate the high transferability of the QAA to diverse target models. For instance, when adopting the ResNet-34 substitute model on ImageNet, QAA outperforms the current best attack in attacking standardly trained DNNs, adversarially trained DNNs, and QNNs with varied bitwidths by 4.3% to 20.9%, 8.7% to 15.5%, and 2.6% to 31.1% (absolute), respectively. In addition, QAA is efficient since it only takes one epoch for fine-tuning. In the end, we empirically explain the effectiveness of QAA from the view of the loss landscape. Our code is available at https://github.com/yyl-github-1896/QAA/
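To make the idea of a multiple-bitwidth training objective concrete, here is a minimal pure-Python sketch. It is not the authors' implementation (see their repository for that): the symmetric uniform quantizer, the toy linear model, the simple averaging of losses, and the bitwidth set (2, 3, 4, 5) are all illustrative assumptions, and the names `quantize_uniform`, `logistic_loss`, and `multi_bitwidth_loss` are hypothetical.

```python
import math

def quantize_uniform(values, bits):
    """Assumed quantizer: snap floats onto a symmetric uniform grid of 2**bits levels."""
    levels = 2 ** bits - 1                         # number of steps between grid ends
    max_abs = max(abs(v) for v in values) or 1.0   # avoid a zero-width grid
    lo = -max_abs
    step = (2 * max_abs) / levels
    return [lo + round((v - lo) / step) * step for v in values]

def logistic_loss(weights, x, y):
    """Binary cross-entropy of a toy linear model on one example, y in {0, 1}."""
    z = sum(w * xi for w, xi in zip(weights, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def multi_bitwidth_loss(weights, x, y, bitwidths=(2, 3, 4, 5)):
    """Assumed objective: average the loss over copies quantized at several bitwidths."""
    losses = [logistic_loss(quantize_uniform(weights, b), x, y) for b in bitwidths]
    return sum(losses) / len(losses)

# Toy data, chosen only for illustration.
weights = [0.8, -0.3, 0.1, 0.55]
x, y = [1.0, 2.0, -1.0, 0.5], 1
print(multi_bitwidth_loss(weights, x, y))
```

The point of the sketch is only that a single set of weights is scored under several quantization levels at once, so that fine-tuning cannot specialize to one bitwidth. How the paper actually combines bitwidths, and how it handles batch normalization across them, is described in its Methodology section rather than here.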

Paper Structure (22 sections, 11 equations, 7 figures, 10 tables, 2 algorithms)


Introduction
Related Work
Transfer-based Black-box Attacks
Attacks and Defenses for QNNs
DNN Quantization
Methodology
Preliminaries: Transfer-based Attacks against QNNs
Problem Formulation
Objective Function
Attack Implementation
Quantitative Analysis on QAA
Experiments
Experimental Settings
Transferability of the QAA
Ablation Study
...and 7 more sections

Figures (7)

Figure 1: A brief overview of the QAA substitute model.
Figure 2: The BN challenge of training substitute model with multiple bitwidths.
Figure 3: Illustration of how the QAA mitigates quantization shift. (a), (b), (c), and (d) show the feature divergence exhibited by 2, 3, 4, and 5-bit target QNNs, respectively.
Figure 4: Visualization of the gradient alignment issue on ImageNet.
Figure 5: Ablation study results. (a) compares the QAA with single-QNN based attacks (5, 4, 3, and 2-bit, respectively). (b) compares the QAA with ensemble attacks using 32-bit and 2-bit models, under three different ensemble methods: logits, softmax, and sampling. (c) shows the ablation study on different fine-tuning objectives (A, B, C, D).
...and 2 more figures
