Breaking the Limits of Quantization-Aware Defenses: QADT-R for Robustness Against Patch-Based Adversarial Attacks in QNNs

Amira Guesmi, Bassem Ouni, Muhammad Shafique

TL;DR

This paper investigates patch-based adversarial attacks in quantized neural networks (QNNs) and demonstrates that such attacks remain highly transferable even at extreme quantization (e.g., 2-bit). To counter this vulnerability, it proposes Quantization-Aware Defense Training with Randomization (QADT-R), which combines Adaptive Quantization-Aware Patch Generation (A-QAPA), Dynamic Bit-Width Training (DBWT), and Gradient-Inconsistent Regularization (GIR). The approach shows substantial reductions in patch-based attack success rates (ASR) across CIFAR-10 and ImageNet while preserving clean accuracy, outperforming prior defenses like PBAT and DWQ, especially under unseen patch configurations. These findings highlight the need for quantization-aware, bit-width-agnostic defenses and provide practical insights into why patch-based attacks persist and how to mitigate them in resource-constrained deployments.

Abstract

Quantized Neural Networks (QNNs) have emerged as a promising solution for reducing model size and computational costs, making them well-suited for deployment in edge and resource-constrained environments. While quantization is known to disrupt gradient propagation and enhance robustness against pixel-level adversarial attacks, its effectiveness against patch-based adversarial attacks remains largely unexplored. In this work, we demonstrate that adversarial patches remain highly transferable across quantized models, achieving over 70% attack success rates (ASR) even at extreme bit-width reductions (e.g., 2-bit). This challenges the common assumption that quantization inherently mitigates adversarial threats. To address this, we propose Quantization-Aware Defense Training with Randomization (QADT-R), a novel defense strategy that integrates Adaptive Quantization-Aware Patch Generation (A-QAPA), Dynamic Bit-Width Training (DBWT), and Gradient-Inconsistent Regularization (GIR) to enhance resilience against highly transferable patch-based attacks. A-QAPA generates adversarial patches within quantized models, ensuring robustness across different bit-widths. DBWT introduces bit-width cycling during training to prevent overfitting to a specific quantization setting, while GIR injects controlled gradient perturbations to disrupt adversarial optimization. Extensive evaluations on CIFAR-10 and ImageNet show that QADT-R reduces ASR by up to 25% compared to prior defenses such as PBAT and DWQ. Our findings further reveal that PBAT-trained models, while effective against seen patch configurations, fail to generalize to unseen patches due to quantization shift. Additionally, our empirical analysis of gradient alignment, spatial sensitivity, and patch visibility provides insights into the mechanisms that contribute to the high transferability of patch-based attacks in QNNs.
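
To make the idea of bit-width cycling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple symmetric uniform quantizer over a fixed range and a round-robin schedule over 8, 4, and 2 bits; the names `quantize_uniform` and `bit_width_schedule`, the range [-1, 1], and the schedule order are all hypothetical choices made for illustration, since the abstract does not specify the quantizer or the cycling policy.

```python
import itertools


def quantize_uniform(values, bits, lo=-1.0, hi=1.0):
    """Snap each value to one of 2**bits evenly spaced levels in [lo, hi].

    Assumption: a plain uniform quantizer; the paper's actual scheme may differ.
    """
    steps = 2 ** bits - 1              # number of intervals between lo and hi
    step = (hi - lo) / steps
    quantized = []
    for v in values:
        clipped = min(max(v, lo), hi)  # clamp into the representable range
        quantized.append(lo + round((clipped - lo) / step) * step)
    return quantized


def bit_width_schedule(bit_widths=(8, 4, 2)):
    """Yield bit-widths in an endless round-robin order (hypothetical policy)."""
    return itertools.cycle(bit_widths)


if __name__ == "__main__":
    weights = [-0.9, -0.2, 0.05, 0.4, 0.95]
    schedule = bit_width_schedule()
    # Each "training step" sees the same weights at a different precision,
    # so no single quantization setting dominates.
    for step_idx in range(3):
        bits = next(schedule)
        print(step_idx, bits, quantize_uniform(weights, bits))
```

In a real training loop the quantizer would be applied to the model's weights and activations before each forward pass; here it is applied to a short list only to show how the representable levels shrink as the bit-width drops.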

Paper Structure

This paper contains 27 sections, 11 equations, 4 figures, and 21 tables.

Figures (4)

  • Figure 1: Model accuracy for ImageNet ResNet-34 across different quantization levels under various pixel-level adversarial attacks.
  • Figure 2: Overview of our proposed methodology, QADT-R, to enhance robustness against patch-based adversarial attacks in QNNs.
  • Figure 3: Feature maps of the 32-bit, 8-bit, 4-bit, and 2-bit models comparing the clean and patched feature maps for the first three convolutional layers.
  • Figure 4: Gradient maps for 32-bit, 8-bit, 4-bit, and 2-bit models under patch-based and pixel-level attacks, along with Cosine Similarity and MSE measurements comparing gradients between the full-precision and quantized models.
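
The two gradient-comparison measures named in the Figure 4 caption can be sketched as follows. This is a generic illustration, assuming the gradients of the full-precision and quantized models have been flattened into equal-length lists of floats; the function names and the zero-norm convention are assumptions made here, not details taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for an all-zero gradient
    return dot / (norm_a * norm_b)


def mean_squared_error(a, b):
    """Average squared element-wise difference between two gradients."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    grad_fp32 = [0.5, -0.25, 0.0, 1.0]   # made-up full-precision gradient
    grad_quant = [0.4, -0.3, 0.1, 0.9]   # made-up quantized-model gradient
    print(cosine_similarity(grad_fp32, grad_quant))
    print(mean_squared_error(grad_fp32, grad_quant))
```

A cosine similarity near 1 means the two gradients point in nearly the same direction, which is the kind of alignment that would let a patch optimized on one model transfer to the other; MSE additionally captures differences in magnitude.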