Robust and Transferable Backdoor Attacks Against Deep Image Compression With Selective Frequency Prior

Yi Yu, Yufei Wang, Wenhan Yang, Lanqing Guo, Shijian Lu, Ling-Yu Duan, Yap-Peng Tan, Alex C. Kot

TL;DR

This work demonstrates a novel backdoor attack against learned image compression by injecting adaptive triggers in the DCT frequency domain, enabling multiple triggers to control bit-rate, reconstruction quality, and downstream vision tasks. The authors propose a dynamic loss and a two-stage training procedure that finetunes only the encoder, achieving strong, transferable backdoors that resist several preprocessing defenses and transfer across models and domains. They validate effectiveness on multiple compression architectures and demonstrate both targeted (e.g., CarToRoad) and ambient (e.g., face privacy) attacks, along with extensive ablations. The study highlights practical security risks in end-to-end learned compression and proposes strategies to measure and potentially mitigate such vulnerabilities.

Abstract

Recent advancements in deep learning-based compression techniques have surpassed traditional methods. However, deep neural networks remain vulnerable to backdoor attacks, where pre-defined triggers induce malicious behaviors. This paper introduces a novel frequency-based trigger injection model for launching backdoor attacks with multiple triggers on learned image compression models. Inspired by the widely used DCT in compression codecs, triggers are embedded in the DCT domain. We design attack objectives tailored to diverse scenarios, including: 1) degrading compression quality in terms of bit-rate and reconstruction accuracy; 2) targeting task-driven measures like face recognition and semantic segmentation. To improve training efficiency, we propose a dynamic loss function that balances loss terms with fewer hyper-parameters, optimizing attack objectives effectively. For advanced scenarios, we evaluate the attack's resistance to defensive preprocessing and propose a two-stage training schedule with robust frequency selection to enhance resilience. To improve cross-model and cross-domain transferability for downstream tasks, we adjust the classification boundary in the attack loss during training. Experiments show that our trigger injection models, combined with minor modifications to encoder parameters, successfully inject multiple backdoors and their triggers into a single compression model, demonstrating strong performance and versatility. (*Because arXiv limits the Abstract field to 1,920 characters, the abstract shown here is shortened. For the full abstract, please download the article.)
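
To make the idea of embedding a trigger in the DCT domain concrete, the following is a minimal sketch rather than the authors' method. It assumes a grayscale image, 8x8 blocks, a hand-picked middle-frequency band, and a fixed additive pattern. The paper instead learns an adaptive trigger injection model, and the names `dct_matrix`, `mid_band_mask`, and `inject_trigger` are hypothetical.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)  # DC row has a different scale factor
    return d


def mid_band_mask(n: int = 8, low: int = 3, high: int = 6) -> np.ndarray:
    """Select coefficients with low <= u + v <= high (an assumed band)."""
    u, v = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return ((u + v >= low) & (u + v <= high)).astype(np.float64)


def inject_trigger(image, trigger, mask, block: int = 8) -> np.ndarray:
    """Add `trigger` to the masked DCT coefficients of every block."""
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image sides must be multiples of the block size")
    d = dct_matrix(block)
    # Reshape into a grid of blocks: (h/block, w/block, block, block).
    blocks = image.reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3)
    coeffs = d @ blocks @ d.T          # forward 2-D DCT per block
    coeffs = coeffs + mask * trigger   # perturb only the selected band
    poisoned = d.T @ coeffs @ d        # inverse 2-D DCT per block
    return poisoned.transpose(0, 2, 1, 3).reshape(h, w)
```

Because the mask excludes the DC coefficient, each block's mean intensity is unchanged, which is one reason a frequency-domain perturbation can stay visually subtle. A fixed pattern like this one only illustrates the injection step; it does not reproduce the learned triggers or the attack objectives described in the abstract.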

Paper Structure

This paper contains 26 sections, 17 equations, 18 figures, 10 tables, 1 algorithm.

Figures (18)

  • Figure 1: Visualization of the proposed backdoor-injected model with multiple triggers attacking bit-rate (BPP) or reconstruction quality (PSNR), respectively. The second sample shows the result of the BPP attack with a huge increase in bit-rate, and the third one presents a PSNR attack with severely corrupted output.
  • Figure 2: Overall architecture for trigger injection. We set $K$ to 16 for top-$K$ selection, and the number of middle frequencies $N$ to 64 in our method.
  • Figure 3: In the training stage, we finetune $g_a\left(\cdot | {\theta_a}\right)$ and train each $T\left(\cdot | {\theta_t^o}\right)$. In the inference stage, we generate poisoned images (e.g., PSNR attack), feed them into the finetuned encoder and the entropy model, and save the bitstream of the poisoned images.
  • Figure 4: Rate-distortion curves of attacking compression results on Kodak dataset. C and P denote using clean input and poisoned input, respectively.
  • Figure 5: PSNR attack: visual results of attacked outputs for various poisoned inputs, using kodim6 from Kodak (AE-Hyperprior, Ballé et al., 2018, with quality level 5).
  • ...and 13 more figures
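
The captions above report two quantities, bit-rate in bits per pixel (BPP) and reconstruction quality in PSNR. As a reference for how these are commonly computed, here is a small sketch using the standard definitions. The function names `psnr` and `bits_per_pixel` are illustrative, and the paper's exact evaluation code may differ (for example, in how color channels are handled).

```python
import numpy as np


def psnr(reference, reconstruction, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def bits_per_pixel(num_bytes: int, height: int, width: int) -> float:
    """Bit-rate of a compressed bitstream relative to the image area."""
    return 8.0 * num_bytes / (height * width)
```

Under these definitions, a BPP attack succeeds when a poisoned input yields a much larger bitstream for the same image size, and a PSNR attack succeeds when the reconstruction error rises sharply while the clean-input PSNR stays close to that of the unmodified model.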