ARQ: A Mixed-Precision Quantization Framework for Accurate and Certifiably Robust DNNs

Yuchen Yang, Shubham Ugare, Yifan Zhao, Gagandeep Singh, Sasa Misailovic

TL;DR

ARQ is a mixed-precision quantization (MPQ) framework that directly optimizes for certified robustness by integrating randomized smoothing into a reinforcement-learning search. It uses a DDPG-based policy to assign per-layer bit-widths under a resource constraint, with robustness quantified by the Average Certified Radius (ACR) and its evaluation accelerated via Incremental Randomized Smoothing. Empirically, ARQ outperforms state-of-the-art MPQ baselines on CIFAR-10 and ImageNet, often matching or surpassing the original FP32 models in both clean accuracy and certified robustness while cutting compute to a few percent of the original operations. This makes certifiably robust quantization practical for large vision models deployed on resource-constrained devices.

Abstract

Mixed-precision quantization has become an important technique for optimizing the execution of deep neural networks (DNNs). Certified robustness, which provides provable guarantees about a model's ability to withstand different adversarial perturbations, has rarely been addressed in quantization due to the unacceptably high cost of certifying robustness. This paper introduces ARQ, an innovative mixed-precision quantization method that not only preserves the clean accuracy of the smoothed classifiers but also maintains their certified robustness. ARQ uses reinforcement learning to find accurate and robust DNN quantization, while efficiently leveraging randomized smoothing, a popular class of statistical DNN verification algorithms. ARQ consistently performs better than multiple state-of-the-art quantization techniques across all the benchmarks and the input perturbation levels. The performance of ARQ-quantized networks reaches that of the original DNN with floating-point weights, but with only 1.5% of the instructions and the highest certified radius. ARQ code is available at https://anonymous.4open.science/r/ARQ-FE4B.

Paper Structure

This paper contains 28 sections, 1 theorem, 12 equations, 6 figures, 8 tables, 2 algorithms.

Key Result

Theorem 2.1

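Suppose $c_A \in \mathcal{Y}$ and $\underline{p_A}, \overline{p_B} \in [0, 1]$. If $\mathbb{P}(f(x+\varepsilon) = c_A) \geq \underline{p_A} \geq \overline{p_B} \geq \max_{c \neq c_A} \mathbb{P}(f(x+\varepsilon) = c)$, where $f$ is the base classifier and $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, then the smoothed classifier satisfies $g(x+\delta) = c_A$ for all $\delta$ with $\|\delta\|_2 \leq \frac{\sigma}{2} (\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B}))$, where $\Phi^{-1}$ denotes the inverse of the standard Gaussian CDF.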
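
The radius in Theorem 2.1 is simple to evaluate once the two probability bounds are known. The following is a minimal sketch, not the authors' implementation: the function names `certified_radius` and `average_certified_radius` are hypothetical, the bounds are assumed to come from a separate Monte Carlo estimation step, and the ACR helper assumes the common convention that misclassified samples contribute a radius of zero.

```python
from statistics import NormalDist


def certified_radius(p_a_lower: float, p_b_upper: float, sigma: float) -> float:
    """L2 radius from Theorem 2.1: (sigma / 2) * (Phi^-1(p_A) - Phi^-1(p_B)).

    p_a_lower and p_b_upper are assumed to be precomputed bounds that lie
    strictly inside (0, 1), where the inverse Gaussian CDF is finite.
    """
    for p in (p_a_lower, p_b_upper):
        if not 0.0 < p < 1.0:
            raise ValueError("probability bounds must lie strictly inside (0, 1)")
    if p_a_lower < p_b_upper:
        # The theorem's precondition fails, so nothing can be certified.
        return 0.0
    inv_cdf = NormalDist().inv_cdf  # inverse of the standard Gaussian CDF
    return 0.5 * sigma * (inv_cdf(p_a_lower) - inv_cdf(p_b_upper))


def average_certified_radius(radii: list[float], correct: list[bool]) -> float:
    """Mean certified radius over a dataset (assumed ACR convention).

    Samples whose smoothed prediction is wrong count as radius 0.
    """
    if len(radii) != len(correct):
        raise ValueError("radii and correct must have the same length")
    if not radii:
        raise ValueError("need at least one sample")
    return sum(r if ok else 0.0 for r, ok in zip(radii, correct)) / len(radii)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    r = certified_radius(p_a_lower=0.9, p_b_upper=0.1, sigma=0.5)
    print(f"certified radius: {r:.4f}")
    print(f"ACR: {average_certified_radius([r, 0.3], [True, False]):.4f}")
```

A larger gap between the two bounds, or a larger $\sigma$, yields a larger certified radius, which is why the search can use ACR as a single scalar measure of robustness.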

Figures (6)

  • Figure 1: Experiments on CIFAR-10. The x-axis shows the percentage of BitOPs of $f_P$ relative to the original floating-point $f$. The y-axis shows the ACR in the first three subfigures; in the fourth, it shows the average difference in clean accuracy between each method and the original floating-point network across different $\sigma$ settings.
  • Figure 2: The difference between empirical robustness and certified robustness. Images are from ImageNet and fed into ResNet-50. The noisy image has a noise level of 0.25. Known adversarial examples use FGSM (Goodfellow et al., 2015), and unknown attacks use PGD (Madry et al., 2019). The empirical defense method is Pixel Deflection (Prakash et al., 2018).
  • Figure 3: Experiments on CIFAR-10. The x-axis shows the model size of $f_P$. The y-axis shows the ACR in the first three subfigures; in the fourth, it shows the average difference in clean accuracy between each method and the original FP32 network across different $\sigma$ settings.
  • Figure 4: Quantization policy among different $\sigma$s for ResNet-50 on ImageNet. The x-axis represents the layer index, and the y-axis represents the bit-width selection in the quantization policy for each specific layer. The $\bullet$ symbol represents the bit-widths for weights, and the $\times$ symbol represents the bit-widths for activations.
  • Figure : ARQ Search Algorithm
  • ...and 1 more figure

Theorems & Definitions (1)

  • Theorem 2.1: From Cohen et al. (ICML 2019)