Rounding-Guided Backdoor Injection in Deep Learning Model Quantization
Xiangxiang Chen, Peixin Zhang, Jun Sun, Wenhai Wang, Jingyi Wang
TL;DR
This work exposes a new supply-chain vulnerability in neural-network quantization by showing that backdoors can be implanted exclusively during the post-training quantization phase through rounding manipulation. The authors introduce QuRA, a training-agnostic attack that crafts a lightweight backdoor trigger and progressively biases the rounding of selected weights across layers while preserving overall accuracy. Extensive experiments across computer vision and NLP models demonstrate near-100% attack success rates in most cases with minimal clean accuracy loss, even under several defense mechanisms, highlighting a significant risk in deployment-time quantization workflows. The findings emphasize the need for robust verification of rounding behavior in quantization tools and caution against outsourcing deployment pipelines without security guarantees.
Abstract
Model quantization is a popular technique for deploying deep learning models in resource-constrained environments. However, it may also introduce previously overlooked security risks. In this work, we present QuRA, a novel backdoor attack that exploits model quantization to embed malicious behaviors. Unlike conventional backdoor attacks, which rely on training data poisoning or manipulation of model training, QuRA works solely through the quantization operations. In particular, QuRA first employs a novel weight selection strategy to identify the critical weights that influence the backdoor target while preserving the model's overall performance. Then, by optimizing the rounding direction of these weights, we amplify the backdoor effect across model layers without degrading accuracy. Extensive experiments demonstrate that QuRA achieves nearly 100% attack success rates in most cases, with negligible performance degradation. Furthermore, we show that QuRA can adapt to bypass existing backdoor defenses, underscoring its threat potential. Our findings highlight a critical vulnerability in the widely used model quantization process, emphasizing the need for more robust security measures. Our implementation is available at https://github.com/cxx122/QuRA.
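To make the role of the rounding direction concrete, the following toy sketch (not the QuRA algorithm, and not taken from the linked repository) quantizes a single weight vector twice: once with round-to-nearest and once with a per-weight choice of floor or ceil. The `quantize` helper, the `trigger` vector, and the rule "round up wherever the trigger input is positive" are illustrative assumptions. The sketch only shows that choosing rounding directions can shift a layer's response to one particular input while every weight stays within a single quantization step of its original value.

```python
import math
import random


def quantize(weights, scale, round_up=None):
    """Uniformly quantize weights to multiples of `scale`.

    round_up=None uses round-to-nearest. Otherwise round_up[i] selects
    ceil (True) or floor (False) for weight i.
    """
    out = []
    for i, w in enumerate(weights):
        x = w / scale
        if round_up is None:
            q = round(x)
        else:
            q = math.ceil(x) if round_up[i] else math.floor(x)
        out.append(q * scale)
    return out


def dot(a, b):
    """Plain dot product, standing in for one neuron's pre-activation."""
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
scale = 0.1
weights = [random.gauss(0.0, 1.0) for _ in range(64)]
# Hypothetical trigger input; a real attack would craft this separately.
trigger = [random.gauss(0.0, 1.0) for _ in range(64)]

w_nearest = quantize(weights, scale)
# Illustrative rule: round up where the trigger is positive, down otherwise,
# so every rounding decision pushes the response to the trigger upward.
w_biased = quantize(weights, scale, [t > 0 for t in trigger])

err_nearest = max(abs(q - w) for q, w in zip(w_nearest, weights))
err_biased = max(abs(q - w) for q, w in zip(w_biased, weights))
out_nearest = dot(w_nearest, trigger)
out_biased = dot(w_biased, trigger)

print(f"max weight error, nearest: {err_nearest:.4f}")
print(f"max weight error, biased:  {err_biased:.4f}")
print(f"response to trigger, nearest: {out_nearest:.4f}")
print(f"response to trigger, biased:  {out_biased:.4f}")
```

In this toy setting the biased rounding keeps each weight within one step (`scale`) of its original value, compared with half a step for round-to-nearest, yet it moves the response to the trigger input consistently in one direction. The attack described above additionally selects which weights to bias so that behavior on clean inputs is preserved, a step this sketch does not attempt.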
