An Effective and Resilient Backdoor Attack Framework against Deep Neural Networks and Vision Transformers
Xueluan Gong, Bowei Tian, Meng Xue, Yuan Wu, Yanjiao Chen, Qian Wang
TL;DR
This work tackles the vulnerability of DNNs and Vision Transformers (ViTs) to backdoor attacks by designing an integrated framework that jointly learns trigger masks and backdoored models. It introduces an attention-based mask to select high-impact pixels, a QoE-aware loss including SSIM to produce natural-looking triggers, and a co-optimization loop with alternating retraining to preserve clean-data accuracy while boosting attack success. The approach extends to ViTs through gradient-enhanced trigger generation and head-layer neuron selection, achieving a higher attack success rate (ASR) and better perceptual quality (low LPIPS) across multiple datasets while resisting state-of-the-art defenses. Overall, the method yields a more effective and stealthier backdoor attack with broad applicability, and it highlights implications for defense design in both CNNs and ViTs.
Abstract
Recent studies have revealed the vulnerability of Deep Neural Network (DNN) models to backdoor attacks. However, existing backdoor attacks set the trigger mask arbitrarily or use a randomly selected trigger, which restricts the effectiveness and robustness of the generated backdoor triggers. In this paper, we propose a novel attention-based mask generation methodology that searches for the optimal trigger shape and location. We also introduce a Quality-of-Experience (QoE) term into the loss function and carefully adjust the transparency value of the trigger to make the backdoored samples look more natural. To further improve the prediction accuracy of the victim model, we propose an alternating retraining algorithm for the backdoor injection process: the victim model is retrained with mixed poisoned datasets in even iterations and with only benign samples in odd iterations. In addition, we launch the backdoor attack under a co-optimized attack framework that alternately optimizes the backdoor trigger and the backdoored model to further improve attack performance. Beyond DNN models, we also extend the proposed attack to vision transformers. We evaluate the method with extensive experiments on the VGG-Flower, CIFAR-10, GTSRB, CIFAR-100, and ImageNette datasets. The results show that it increases the attack success rate by as much as 82% over baselines when the poison ratio is low and achieves a high QoE for the backdoored samples. The proposed backdoor attack framework is also robust against state-of-the-art backdoor defenses.
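
To make the alternating retraining schedule concrete, the following is a minimal sketch in pure Python rather than the paper's implementation. It assumes a simple blending rule x' = (1 - alpha * m) * x + alpha * m * t for stamping a trigger t onto an image x through a mask m, assumes 0-based iteration counting, and assumes the "mixed" set is the clean data plus triggered copies of a fraction of it. The names `apply_trigger`, `poison_subset`, `retraining_schedule`, `alpha`, and `poison_ratio` are hypothetical. The attention-based mask search, the QoE loss term, and the actual model update are not shown.

```python
from typing import Iterator, List, Sequence, Tuple

Image = List[List[float]]   # grayscale image with pixel values in [0, 1]
Sample = Tuple[Image, int]  # (image, label)


def apply_trigger(image: Image, trigger: Image,
                  mask: Sequence[Sequence[float]], alpha: float) -> Image:
    """Blend the trigger into the image where the mask is non-zero.

    Assumed rule (not taken from the paper):
    x' = (1 - alpha * m) * x + alpha * m * t
    """
    return [
        [(1 - alpha * m) * x + alpha * m * t
         for x, t, m in zip(img_row, trig_row, mask_row)]
        for img_row, trig_row, mask_row in zip(image, trigger, mask)
    ]


def poison_subset(clean: Sequence[Sample], trigger: Image,
                  mask: Sequence[Sequence[float]], alpha: float,
                  target_label: int, poison_ratio: float) -> List[Sample]:
    """Return triggered copies of the first poison_ratio fraction of samples,
    relabeled to the attacker's target label."""
    n_poison = int(len(clean) * poison_ratio)
    return [(apply_trigger(img, trigger, mask, alpha), target_label)
            for img, _ in clean[:n_poison]]


def retraining_schedule(clean: Sequence[Sample], trigger: Image,
                        mask: Sequence[Sequence[float]], alpha: float,
                        target_label: int, poison_ratio: float,
                        num_iterations: int) -> Iterator[Tuple[int, List[Sample]]]:
    """Yield (iteration, training_set) pairs.

    Even iterations use the mixed set (clean plus poisoned samples);
    odd iterations use benign samples only.
    """
    poisoned = poison_subset(clean, trigger, mask, alpha,
                             target_label, poison_ratio)
    for it in range(num_iterations):
        if it % 2 == 0:
            yield it, list(clean) + poisoned
        else:
            yield it, list(clean)
```

In a full pipeline, each yielded training set would be passed to one retraining step of the victim model. The benign-only odd iterations correspond to the abstract's stated goal of improving prediction accuracy on clean data.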
