A Training-Free Defense Framework for Robust Learned Image Compression

Myungseo Song, Jinyoung Choi, Bohyung Han

TL;DR

Learned image compression models are vulnerable to adversarial perturbations that can inflate the bitrate or distort reconstructions. The authors propose a training-free defense that combines input randomization with a two-way compression scheme: the input is encoded both in its original form and after a transform, and the rate-distortion objective decides which encoding is kept. The approach maintains performance on clean images while significantly improving robustness across multiple models and attack scenarios, including white-box and gray-box settings, and it generalizes to FDA-style attacks. Because it needs no retraining, the defense can be applied to pretrained compressors with minimal overhead.

Abstract

We study the robustness of learned image compression models against adversarial attacks and present a training-free defense technique based on simple image transform functions. Recent learned image compression models are vulnerable to adversarial attacks that result in poor compression rates, low reconstruction quality, or unexpected artifacts. To address these vulnerabilities, we propose a simple but effective two-way compression algorithm with random input transforms, which is readily applicable to existing image compression models. Unlike naïve approaches, our approach preserves the original rate-distortion performance of the models on clean images. Moreover, the proposed algorithm requires no additional training or modification of existing models, making it more practical. We demonstrate the effectiveness of the proposed techniques through extensive experiments under multiple compression models, evaluation metrics, and attack scenarios.
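
The two-way selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 1-D "image", the `toy_codec` quantizer, the cyclic shift used as the random transform, the `rate + lam * distortion` cost, and all function names are assumptions made for illustration only. The sketch compresses the input twice, once as-is and once after a random invertible transform, and keeps whichever result has the lower rate-distortion cost.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[float]                              # toy 1-D "image" (assumption)
Codec = Callable[[Image], Tuple[float, Image]]   # returns (rate, reconstruction)


def toy_codec(x: Image, step: float = 0.25) -> Tuple[float, Image]:
    """Hypothetical stand-in for a pretrained compressor: a uniform quantizer."""
    symbols = [round(v / step) for v in x]
    rate = float(sum(abs(s) for s in symbols))   # crude rate proxy, not real bits
    return rate, [s * step for s in symbols]


def mse(a: Image, b: Image) -> float:
    """Mean squared error between two equally sized signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def cyclic_shift(x: Image, k: int) -> Image:
    """Shift right by k positions; a shift by -k undoes it."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


@dataclass
class Encoding:
    path: str      # "original" or "transformed"
    rate: float
    recon: Image
    cost: float    # rate + lam * distortion


def two_way_compress(x: Image, codec: Codec, lam: float,
                     rng: random.Random) -> Encoding:
    # Path 1: compress the input as-is.
    r0, rec0 = codec(x)
    original = Encoding("original", r0, rec0, r0 + lam * mse(x, rec0))

    # Path 2: compress a randomly shifted copy, then undo the shift.
    k = rng.randrange(1, len(x))
    r1, rec_t = codec(cyclic_shift(x, k))
    rec1 = cyclic_shift(rec_t, -k)
    transformed = Encoding("transformed", r1, rec1, r1 + lam * mse(x, rec1))

    # Keep the path with the lower rate-distortion cost (ties keep the original).
    return min(original, transformed, key=lambda e: e.cost)


if __name__ == "__main__":
    best = two_way_compress([0.1, 0.9, 0.4, 0.7], toy_codec,
                            lam=10.0, rng=random.Random(0))
    print(best.path, best.rate, best.cost)
```

Because the untransformed path is always a candidate, the selected cost in this sketch can never exceed that of plain compression, which mirrors the abstract's claim that clean-image rate-distortion performance is preserved. A real decoder would also need to know which path was chosen and which transform was applied; how the paper signals this is not stated in this section.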

Paper Structure (32 sections, 10 equations, 14 figures, 2 tables, 2 algorithms)

Figures (14)

  • Figure 1: Demonstration of the vulnerability of a learned image compression model to adversarial attacks and the effectiveness of our defense method. The yellow annotations in each reconstructed image denote bits per pixel (bpp)/PSNR (dB)/MS-SSIM.
  • Figure 2: Examples of adversarially perturbed images (top) and corresponding reconstructed images (bottom).
  • Figure 3: Results of adversarial attacks that target poor compression rates on image compression models, for various $\epsilon$ values of the PGD algorithm. Top: results of low-bitrate models. Bottom: results of high-bitrate models. Clean denotes the performance on clean (i.e., unperturbed) images.
  • Figure 4: (a) Input randomization for image classification. (b), (c) Input randomization for encoder and decoder of image compression.
  • Figure 5: Performance degradation of an image compression model caused by a variety of input transforms.
  • ...and 9 more figures