Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles

Dong Lao; Yuxiang Zhang; Haniyeh Ehsani Oskouie; Yangchao Wu; Alex Wong; Stefano Soatto

TL;DR

The paper addresses the vulnerability of deep networks to adversarial perturbations with a test-time defense based on stochastic resonance in latent space. It applies small integer-pixel translations to the input, aligns the resulting embeddings through a latent-space push-forward/inverse operation, and ensembles them, yielding a training-free, architecture-agnostic defense. The approach delivers state-of-the-art robustness on image classification and extends to dense prediction tasks such as stereo matching and optical flow, including against adaptive attacks. The defense can be tuned to a computational budget and applies across tasks and architectures.

Abstract

We propose a test-time defense mechanism against adversarial attacks: imperceptible image perturbations that significantly alter the predictions of a model. Unlike existing methods that rely on feature filtering or smoothing, which can lead to information loss, we propose to "combat noise with noise" by leveraging stochastic resonance to enhance robustness while minimizing information loss. Our approach introduces small translational perturbations to the input image, aligns the transformed feature embeddings, and aggregates them before mapping back to the original reference image. This can be expressed in a closed-form formula, which can be deployed on diverse existing network architectures without introducing additional network modules or fine-tuning for specific attack types. The resulting method is entirely training-free, architecture-agnostic, and attack-agnostic. Empirical results show state-of-the-art robustness on image classification and, for the first time, establish a generic test-time defense for dense prediction tasks, including stereo matching and optical flow, highlighting the method's versatility and practicality. Specifically, relative to clean (unperturbed) performance, our method recovers up to 68.1% of the accuracy loss on image classification, 71.9% on stereo matching, and 29.2% on optical flow under various types of adversarial attacks.
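
The translate, align, and aggregate steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical `encoder` whose output has the same spatial resolution as its input, and it uses circular shifts so that every translation is exactly invertible. The names `shift` and `latent_ensemble` and the choice of offsets are illustrative assumptions. Real encoders that downsample need the latent-space alignment described in the paper, which this sketch does not reproduce.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]


def shift(grid: Grid, dy: int, dx: int) -> Grid:
    """Circularly translate a 2D grid by (dy, dx) pixels (assumed wrap-around)."""
    h, w = len(grid), len(grid[0])
    return [[grid[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]


def latent_ensemble(
    image: Grid,
    encoder: Callable[[Grid], Grid],
    offsets: Sequence[Tuple[int, int]],
) -> Grid:
    """Average embeddings of translated copies after aligning them back.

    Assumes the (hypothetical) encoder preserves spatial resolution, so a
    pixel shift of the input maps to the same shift in the embedding.
    """
    h, w = len(image), len(image[0])
    acc = [[0.0] * w for _ in range(h)]
    for dy, dx in offsets:
        feat = encoder(shift(image, dy, dx))  # embed the translated input
        aligned = shift(feat, -dy, -dx)       # undo the translation in latent space
        for y in range(h):
            for x in range(w):
                acc[y][x] += aligned[y][x]
    n = len(offsets)
    return [[v / n for v in row] for row in acc]


if __name__ == "__main__":
    image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
    offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    # Toy stand-in for a network layer: a pointwise map, which is shift-equivariant.
    toy_encoder = lambda g: [[2.0 * v for v in row] for row in g]
    print(latent_ensemble(image, toy_encoder, offsets))
```

With a perfectly shift-equivariant encoder such as the toy one above, the ensemble reproduces the single-pass embedding, so the sketch only demonstrates the alignment bookkeeping. Any benefit would come from encoders whose response to a perturbed input varies across translations.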
