FACL-Attack: Frequency-Aware Contrastive Learning for Transferable Adversarial Attacks

Hunmin Yang, Jongoh Jeong, Kuk-Jin Yoon

TL;DR

This work tackles transferable adversarial attacks under strict black-box conditions, where both the target domain and the target model are unknown, by working in the frequency domain. It introduces two training-time modules: Frequency-Aware Domain Randomization (FADR), which diversifies domain-specific low- and high-band information, and Frequency-Augmented Contrastive Learning (FACL), which contrasts the domain-invariant mid-band features of clean and perturbed images. Through spectral decomposition via DCT and a band-specific contrastive objective, the approach yields a perturbation generator that transfers across unseen domains and models without increasing inference cost. Cross-domain and cross-model experiments on ImageNet-1K and several target datasets demonstrate state-of-the-art transferability, while ablation and robustness analyses validate the complementary roles of FADR and FACL. The method has practical implications for evaluating and understanding adversarial transferability in realistic black-box scenarios.

Abstract

Deep neural networks are known to be vulnerable to security risks due to the inherent transferable nature of adversarial examples. Despite the success of recent generative model-based attacks demonstrating strong transferability, it remains a challenge to design an efficient attack strategy in a real-world strict black-box setting, where both the target domain and model architectures are unknown. In this paper, we seek to explore a feature contrastive approach in the frequency domain to generate adversarial examples that are robust in both cross-domain and cross-model settings. With that goal in mind, we propose two modules that are only employed during the training phase: a Frequency-Aware Domain Randomization (FADR) module to randomize domain-variant low- and high-range frequency components and a Frequency-Augmented Contrastive Learning (FACL) module to effectively separate domain-invariant mid-frequency features of clean and perturbed images. We demonstrate strong transferability of our generated adversarial perturbations through extensive cross-domain and cross-model experiments, while keeping the inference-time complexity unchanged.
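
The band split that both modules rely on can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names, the thresholds `f_low` and `f_high`, and the choice of `max(u, v)` as the band index are all hypothetical, and the image is assumed to be square and single-channel.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]          # frequency index (rows)
    i = np.arange(n)[None, :]          # spatial index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)            # rescale the DC row for orthonormality
    return c

def mid_band_mask(n: int, f_low: int, f_high: int) -> np.ndarray:
    """1.0 where f_low <= max(u, v) < f_high, else 0.0 (assumed band definition)."""
    u = np.arange(n)[:, None]
    v = np.arange(n)[None, :]
    band = np.maximum(u, v)
    return ((band >= f_low) & (band < f_high)).astype(float)

def split_bands(image: np.ndarray, f_low: int, f_high: int):
    """Split a square grayscale image into mid-band and low/high-band parts."""
    n = image.shape[0]
    c = dct_matrix(n)
    spectrum = c @ image @ c.T                        # 2-D DCT
    mask = mid_band_mask(n, f_low, f_high)
    mid = c.T @ (spectrum * mask) @ c                 # inverse DCT of mid band
    low_high = c.T @ (spectrum * (1.0 - mask)) @ c    # inverse DCT of the rest
    return mid, low_high

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.uniform(0.0, 255.0, size=(32, 32))
    mid, low_high = split_bands(img, f_low=4, f_high=16)
    # The two parts sum back to the input because the DCT is linear.
    print(np.allclose(mid + low_high, img))
```

Because the transform is linear and orthonormal, the two parts always sum back to the input, so the split loses no information. It only routes each coefficient to the "domain-invariant" or the "domain-variant" side.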

Paper Structure (46 sections, 9 equations, 10 figures, 15 tables, 1 algorithm)

Figures (10)

  • Figure 1: To boost the transferability of adversarial examples, we exploit band-specific characteristics of natural images in the frequency domain. Our method randomizes domain-variant low- and high-band frequency components (FCs) in the data space, and contrasts domain-invariant mid-range clean and perturbed feature pairs in the feature space.
  • Figure 2: Overview of FACL-Attack. From the clean input image, our FADR module outputs the augmented image after a spectral transformation that randomizes only the domain-variant low/high FCs. From the randomized image, the perturbation generator $G_{\theta}(\cdot)$ and the perturbation projector $P(\cdot)$ then produce the $\ell_{\infty}$-bounded adversarial image $\boldsymbol{\mathit{x}}'_{s}$. The resulting clean and adversarial image pairs are decomposed into mid-band (domain-agnostic) and low/high-band (domain-specific) FCs, whose features $f_{k}(\cdot)$, extracted from the $k$-th layer of the surrogate model, are contrasted in our FACL module to boost adversarial transferability. The adversarial image $\boldsymbol{\mathit{x}}'_{s}$ is colorized only for visualization.
  • Figure 3: Visualization of spectral transformation in FADR. From the clean input image (column 1), our FADR decomposes the image into mid-band (column 2) and low/high-band (column 3) FCs. The FADR only randomizes the low/high-band FCs, yielding the augmented output in column 4. Here we demonstrate transformations with large hyper-parameters of $\rho=0.5$ and $\sigma=8$ for visualization.
  • Figure 4: From left to right: clean image, unbounded adversarial images from the baseline and from baseline+FACL, and the final difference map (Diff(baseline, baseline+FACL)). Our generated adversarial perturbations focus more on domain-agnostic semantic regions such as shape, facilitating more transferable attacks.
  • Figure 5: Average cross-domain evaluation results across various frequency thresholds.
  • ...and 5 more figures
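
The captions for Figures 2 and 3 describe two operations that a short sketch can make concrete: randomizing only the low/high-band FCs, and projecting the adversarial image back into an $\ell_{\infty}$ budget. The NumPy code below is a hedged illustration, not the paper's code. The captions do not define $\rho$ or $\sigma$, so the sketch assumes $\rho$ is the half-width of a multiplicative uniform scaling and $\sigma$ is the standard deviation of additive Gaussian noise, both applied here directly to DCT coefficients. The function names, the thresholds, the `max(u, v)` band index, and the [0, 255] pixel range are likewise assumptions.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def fadr_randomize(image, f_low, f_high, rho, sigma, rng):
    """Randomize only low/high-band DCT coefficients (assumed FADR-style step).

    Coefficients with f_low <= max(u, v) < f_high are left untouched.
    """
    n = image.shape[0]
    c = dct_matrix(n)
    spectrum = c @ image @ c.T
    u = np.arange(n)[:, None]
    v = np.arange(n)[None, :]
    band = np.maximum(u, v)
    variant = (band < f_low) | (band >= f_high)        # low/high-band positions
    scale = rng.uniform(1.0 - rho, 1.0 + rho, size=spectrum.shape)
    noise = rng.normal(0.0, sigma, size=spectrum.shape)
    randomized = np.where(variant, spectrum * scale + noise, spectrum)
    return c.T @ randomized @ c                        # back to pixel space

def project_linf(adv, clean, eps):
    """Clamp adv into the l_inf ball of radius eps around clean, then to [0, 255]."""
    bounded = np.clip(adv, clean - eps, clean + eps)
    return np.clip(bounded, 0.0, 255.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.uniform(0.0, 255.0, size=(32, 32))
    aug = fadr_randomize(img, f_low=4, f_high=16, rho=0.5, sigma=8.0, rng=rng)
    adv = project_linf(aug + 40.0, aug, eps=10.0)
    print(float(np.max(np.abs(adv - aug))) <= 10.0 + 1e-9)
```

In this sketch the augmentation leaves the mid band of the image untouched, which matches the caption's statement that only the low/high-band FCs are randomized. The projection is a plain element-wise clamp and stands in for $P(\cdot)$ only under the assumptions above.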