Spatial-Frequency Discriminability for Revealing Adversarial Perturbations

Chao Wang; Shuren Qi; Zhiqiu Huang; Yushu Zhang; Rushi Lan; Xiaochun Cao; Feng-Lei Fan

TL;DR

This work tackles adversarial perturbations in image classification and DLaaS by introducing a spatial-frequency discriminative detector built on a mid-scale Krawtchouk decomposition. The detector leverages secret-key randomization to enhance security against defense-aware attacks and uses frequency-band integration with a simple SVM classifier to achieve strong discriminability across diverse datasets and attacks. Key contributions include (i) a Krawtchouk-based decomposition that captures both spatial and frequency cues, (ii) a secrecy mechanism via random feature selection, and (iii) extensive cross-dataset/model/attack validation demonstrating robust performance and transferability. The approach has practical implications for real-world DLaaS security, offering a competitive, configurable, and scalable detector that operates without modifying the underlying models.

Abstract

The vulnerability of deep neural networks to adversarial perturbations is widely recognized in the computer vision community. From a security perspective, it poses a critical risk for modern vision systems, e.g., the popular Deep Learning as a Service (DLaaS) frameworks. To protect deep models without modifying them, current algorithms typically detect adversarial patterns through a discriminative decomposition of natural and adversarial data. However, these decompositions are biased towards either frequency resolution or spatial resolution, and thus fail to capture adversarial patterns comprehensively. Also, when the detector relies on a few fixed features, it is practical for an adversary to fool the model while evading the detector (i.e., a defense-aware attack). Motivated by these facts, we propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition. It extends the above works in two respects: 1) compared with the common trigonometric or wavelet basis, the introduced Krawtchouk basis provides better spatial-frequency discriminability, capturing the differences between natural and adversarial data comprehensively in both spatial and frequency distributions; 2) compared with a detector with a few fixed features, the extensive features formed by the Krawtchouk decomposition allow for adaptive feature selection and a secrecy mechanism, significantly increasing the difficulty of the defense-aware attack. Theoretical and numerical analyses demonstrate the uniqueness and usefulness of our detector, which exhibits competitive scores on several deep models and image sets against a variety of adversarial attacks.
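
To make the decomposition named in the abstract more concrete, the following is a minimal sketch that builds weighted Krawtchouk polynomials from their standard hypergeometric definition and projects a small image block onto them. The function names, the convention $N = L - 1$ for an axis of length $L$, and the assignment of $P_x$ to rows and $P_y$ to columns are assumptions made for illustration; they are not taken from the paper, and the direct summation used here is only suitable for small $L$.

```python
import numpy as np
from math import comb


def weighted_krawtchouk(L, P):
    """Return an L x L matrix; row l holds K_bar_l(z; P, L) for z = 0..L-1.

    Assumes the standard definition K_l(z) = 2F1(-l, -z; -N; 1/P) with
    N = L - 1, a binomial weight, and the usual squared norm rho(l).
    """
    N = L - 1
    z = np.arange(L, dtype=float)
    K = np.zeros((L, L))
    for l in range(L):
        term = np.ones(L)      # k = 0 term of the hypergeometric sum
        total = term.copy()
        for k in range(l):
            # Ratio of consecutive terms in the terminating 2F1 series.
            term = term * (-l + k) * (-z + k) / ((-N + k) * (k + 1) * P)
            total = total + term
        K[l] = total
    # Binomial weight w(z) and squared norm rho(l) of the unweighted polynomials.
    w = np.array([comb(N, x) * P**x * (1 - P) ** (N - x) for x in range(L)])
    rho = np.array([((1 - P) / P) ** l / comb(N, l) for l in range(L)])
    return K * np.sqrt(w)[None, :] / np.sqrt(rho)[:, None]


def krawtchouk_decompose(image, P_x, P_y):
    """Project a 2-D array onto the weighted Krawtchouk basis (rows: P_x, columns: P_y)."""
    B_x = weighted_krawtchouk(image.shape[0], P_x)
    B_y = weighted_krawtchouk(image.shape[1], P_y)
    return B_x @ image @ B_y.T


def krawtchouk_reconstruct(coeffs, P_x, P_y):
    """Invert krawtchouk_decompose using the orthonormality of the basis."""
    B_x = weighted_krawtchouk(coeffs.shape[0], P_x)
    B_y = weighted_krawtchouk(coeffs.shape[1], P_y)
    return B_x.T @ coeffs @ B_y
```

In this sketch the order index controls the number of zeros of each basis function, while the binomial weight peaks near $z = P(L-1)$, so changing $P_x$ and $P_y$ moves the spatial region that the low-order coefficients emphasize.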

Paper Structure (34 sections, 14 equations, 12 figures, 5 tables)

Introduction
State of the Arts
Motivations
Contributions
General Formulation
Model Formulation
Attack Formulation
Defense Formulation
Towards Accurate and Secure Detector: Formulation
Discriminability Analysis
Security Analysis
Towards Accurate and Secure Detector: Methodology
Overview
Spatial-frequency Discriminative Decomposition with Secret Keys
Discriminability Analysis
...and 19 more sections

Figures (12)

Figure 1: Illustration of the training phase of the proposed adversarial example detector. The detector is trained on a set of adversarial/clean image examples along with their corresponding labels. The detector consists of three main steps: 1) the image is projected into a space defined by Krawtchouk polynomials, where the frequency parameters $(n,m)$ and spatial parameters $({P_x},{P_y})$ are determined by the key; 2) the obtained coefficients are integrated and enhanced, using certain beneficial priors, to form a compact but expressive feature vector; 3) these features are fed into an SVM for prediction, which is the only learned part of the detector.
Figure 2: Illustration for the inference phase of the proposed adversarial example detector. When the detector is trained (i.e., with the key and learned SVM parameters), it can be deployed in various real-world scenarios, where a DLaaS scenario is chosen as an example. For an image under analysis, our detector predicts whether it contains adversarial perturbations. With such prediction, the DLaaS is able to deny the service when the adversarial perturbation is revealed.
Figure 3: Illustration of the weighted Krawtchouk polynomials ${\bar{K}_l}(z;P,L)$ with $l \in \{ 2,4,8\}$, $P \in \{ 0.25,0.5,0.75\}$, and $L = 100$. Note that the number and location of the zeros of $\bar{K}$ can be adjusted explicitly by $l$ and $P$, respectively, which determines the time-frequency discriminability of the represented image information.
Figure 4: Illustration for some datasets and adversarial attacks involved in experiments.
Figure 5: The benchmarking of adversarial perturbation detection accuracy for different detectors on the MNIST dataset.
...and 7 more figures
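
Step 1 of the Figure 1 caption has a key determine the frequency parameters $(n,m)$ and the spatial parameters $({P_x},{P_y})$. The sketch below shows one hypothetical way such a mapping could look: the key is hashed into a seed, which then draws the spatial parameters and a subset of frequency orders. The function name, the hashing scheme, and the parameter ranges are all assumptions for illustration; the paper's actual mechanism may differ, and a production system would need a cryptographically secure generator.

```python
import hashlib
import random


def params_from_key(key: bytes, size: int, n_features: int):
    """Derive (P_x, P_y) and a list of (n, m) orders from a secret key.

    Hypothetical scheme: SHA-256 of the key seeds Python's PRNG, which is
    deterministic but NOT cryptographically secure (illustration only).
    """
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    rng = random.Random(seed)
    # Spatial parameters, kept away from 0 and 1 (assumed range).
    p_x = rng.uniform(0.1, 0.9)
    p_y = rng.uniform(0.1, 0.9)
    # Frequency orders: a key-dependent subset of all (n, m) pairs.
    all_orders = [(n, m) for n in range(size) for m in range(size)]
    orders = sorted(rng.sample(all_orders, n_features))
    return (p_x, p_y), orders
```

Because the same key always reproduces the same parameters, the detector can be rebuilt at inference time from the key alone, while an adversary without the key does not know which coefficients are fed to the classifier.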

Theorems & Definitions (4)

Conjecture 1
Conjecture 2
Definition 1
Definition 2
