Beyond Classification: Evaluating Diffusion Denoised Smoothing for Security-Utility Trade off
Yury Belousov, Brian Pulfer, Vitaliy Kinakh, Slava Voloshynovskiy
TL;DR
This work investigates Diffusion Denoised Smoothing as a defense for Vision Foundation Models (VFMs) across multiple downstream tasks, highlighting a persistent security-utility trade-off. By evaluating three adversarial attacks under varying diffusion noise levels and attack-time configurations, the study shows that high-noise diffusion denoising can substantially shield models but degrades performance on clean inputs, while low-noise settings preserve performance but offer limited protection. An attack that targets the diffusion process itself exposes a further weakness: the defense can be circumvented in the low-noise regime. Overall, the results indicate that diffusion-based defenses are promising, but achieving adversarial robustness without sacrificing task performance remains an open challenge for security-critical VFMs.
Abstract
While foundation models demonstrate impressive performance across various tasks, they remain vulnerable to adversarial inputs. Current research explores various approaches to enhance model robustness, with Diffusion Denoised Smoothing emerging as a particularly promising technique. This method employs a pretrained diffusion model to preprocess inputs before model inference. Yet its effectiveness remains largely unexplored beyond classification. We aim to address this gap by analyzing three datasets with four distinct downstream tasks under three different adversarial attack algorithms. Our findings reveal that while foundation models remain resilient to conventional transformations, applying high-noise diffusion denoising to clean, undistorted images degrades performance by as much as 57%. Low-noise diffusion settings preserve performance but fail to provide adequate protection across all attack types. Moreover, we introduce a novel attack strategy that specifically targets the diffusion process itself and can circumvent the defense in the low-noise regime. Our results suggest that the trade-off between adversarial robustness and performance remains a challenge to be addressed.
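The preprocessing step described in the abstract can be illustrated with a minimal sketch. It assumes a linear DDPM-style noise schedule, inputs scaled to [-1, 1], and a single noise draw per input; none of these choices is stated in the text above. The names `alpha_bar_schedule`, `timestep_for_sigma`, and `denoised_smoothing_predict` are hypothetical, and `denoiser` and `model` are placeholder callables standing in for a pretrained diffusion model and a downstream foundation model.

```python
import numpy as np

def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule (assumed, not from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def timestep_for_sigma(sigma, alpha_bar):
    """Pick the diffusion step whose noise-to-signal ratio is closest to sigma^2."""
    ratio = (1.0 - alpha_bar) / alpha_bar
    return int(np.argmin(np.abs(ratio - sigma ** 2)))

def denoised_smoothing_predict(x, sigma, denoiser, model, alpha_bar, rng):
    """Noise the input, denoise it in one shot, then run the downstream model.

    denoiser(x_t, t) and model(x) are hypothetical callables supplied by the caller.
    """
    t = timestep_for_sigma(sigma, alpha_bar)
    # Gaussian perturbation at the chosen noise level.
    noisy = x + sigma * rng.standard_normal(x.shape)
    # Rescale so the sample matches the diffusion model's marginal at step t.
    x_t = np.sqrt(alpha_bar[t]) * noisy
    # One-shot estimate of the clean input, passed on to the downstream task.
    x_hat = denoiser(x_t, t)
    return model(x_hat)
```

A larger `sigma` maps to a later timestep, so more of the input is overwritten by noise before denoising. This is the knob behind the high-noise versus low-noise regimes discussed above: more noise removes more of an adversarial perturbation but also more of the original image content.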
