Keep It Real: Challenges in Attacking Compression-Based Adversarial Purification

Samuel Räber, Till Aczel, Andreas Plesner, Roger Wattenhofer

TL;DR

The paper evaluates compression-based purification as a defense for image classifiers against adversarial perturbations, arguing that realism in reconstructed images, rather than gradient masking, is the key to robustness. It develops strong white-box and adaptive attacks against learned and traditional compression models, and demonstrates that high-realism reconstructions substantially raise the bar for attackers across multiple threat models and architectures. The study suggests that realistic reconstructions stay aligned with the natural-image distribution while suppressing adversarial noise, achieving meaningful robustness with lower computational cost than diffusion-based approaches. The findings highlight realism as a central concern for future security evaluations and motivate the design of attacks that specifically target realism-based defenses, with practical implications for deploying robust preprocessing pipelines.

Abstract

Previous work has suggested that preprocessing images through lossy compression can defend against adversarial perturbations, but comprehensive attack evaluations have been lacking. In this paper, we construct strong white-box and adaptive attacks against various compression models and identify a critical challenge for attackers: high realism in reconstructed images significantly increases attack difficulty. Through rigorous evaluation across multiple attack scenarios, we demonstrate that compression models capable of producing realistic, high-fidelity reconstructions are substantially more resistant to our attacks. In contrast, low-realism compression models can be broken. Our analysis reveals that this is not due to gradient masking. Rather, realistic reconstructions maintaining distributional alignment with natural images seem to offer inherent robustness. This work highlights a significant obstacle for future adversarial attacks and suggests that developing more effective techniques to overcome realism represents an essential challenge for comprehensive security evaluation.

Paper Structure

This paper contains 43 sections, 8 equations, 8 figures, 12 tables.

Figures (8)

  • Figure 1: Decrease in robust accuracy when employing a compression defense with reduced realism under different perturbation budgets. Incorporating realism substantially increases the difficulty of successful attacks.
  • Figure 2: Overview of compression-based adversarial defense. An input image (potentially containing adversarial perturbations) is first processed by the defense module, which consists of an encoder-decoder architecture that compresses and reconstructs the image. This reconstructed image is then passed to a classifier, which outputs class probabilities. The defense aims to indirectly, through the compression process, remove adversarial noise while preserving the semantic content needed for correct classification.
  • Figure 3: Loss landscapes under successful 100-step PGD attacks on a ResNet with CRDR defense. Left: Attacking the classifier directly. Middle: Attacking with low realism defense. Right: Attacking high realism defense. The standard deviations of the loss surfaces are 0.0544, 0.3343, and 0.3156, respectively. Increasing realism does not make the loss landscape spikier, indicating that it does not contribute to gradient masking.
  • Figure 4: Three different diffusion outputs for the same input image. These showcase the large differences the diffusion models can introduce and thus what the adversarial noise must be robust to.
  • Figure 5: Accuracy of the adversarial purification defense for 100 iterations of PGD. The blue curve shows the accuracy for every iteration, and the orange curve shows the fraction of images that have never been misclassified up to that iteration.
  • ...and 3 more figures
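
The two curves described for Figure 5 (accuracy at each PGD iteration, and the fraction of images never misclassified up to that iteration) can be computed from a per-iteration record of which images are still classified correctly. The following is a minimal sketch, not the authors' code: the function name `accuracy_curves` and the input layout `correct[t][i]` are assumptions made here for illustration.

```python
from typing import List, Tuple


def accuracy_curves(correct: List[List[bool]]) -> Tuple[List[float], List[float]]:
    """Compute the two curves of a Figure-5-style plot.

    correct[t][i] is True if image i is classified correctly after
    attack iteration t (hypothetical layout, chosen for this sketch).
    Returns (per_iteration_accuracy, never_misclassified_fraction).
    """
    if not correct or not correct[0]:
        raise ValueError("need at least one iteration and one image")

    n_images = len(correct[0])
    # Tracks, per image, whether it has survived every iteration so far.
    never_fooled = [True] * n_images
    per_iteration: List[float] = []
    cumulative: List[float] = []

    for row in correct:
        if len(row) != n_images:
            raise ValueError("every iteration must cover the same images")
        # Accuracy at this iteration only.
        per_iteration.append(sum(row) / n_images)
        # An image counts only if it was correct at every iteration so far.
        never_fooled = [alive and ok for alive, ok in zip(never_fooled, row)]
        cumulative.append(sum(never_fooled) / n_images)

    return per_iteration, cumulative


# Toy example: 3 attack iterations, 4 images.
history = [
    [True, True, True, True],
    [True, False, True, True],   # image 1 fooled at iteration 1
    [True, True, False, True],   # image 1 recovers, image 2 fooled
]
per_iter, never = accuracy_curves(history)
print(per_iter)  # [1.0, 0.75, 0.75]
print(never)     # [1.0, 0.75, 0.5]
```

In the toy example the per-iteration accuracy stays at 0.75 while the never-misclassified fraction drops to 0.5, because a different image is fooled at each step. The second curve can never rise, so it is the more conservative measure of robustness when an attacker may stop at any iteration.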