Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee, Yeonghwan Song, Jeany Son
TL;DR
This work tackles the data-free universal adversarial perturbation (UAP) problem by introducing PSP-UAP, which extracts pseudo-semantic priors directly from the evolving UAP to generate region-based semantic samples without target-domain data. By incorporating input transformations and a sample reweighting scheme guided by KL-divergence, PSP-UAP significantly enhances black-box transferability across diverse CNN architectures on ImageNet, achieving state-of-the-art performance among data-free methods and competitive results relative to data-dependent UAPs. The approach reveals that semantic cues embedded within the UAP can be leveraged as a data substitute, enabling robust universal attacks in data-scarce scenarios. While effective for CNNs, the method shows limited transferability to ViT-based models, pointing to future work on devising black-box strategies for transformer architectures and further defenses against such data-free perturbations.
Abstract
Data-free Universal Adversarial Perturbation (UAP) is an image-agnostic adversarial attack that deceives deep neural networks using a single perturbation generated solely from random noise without relying on data priors. However, traditional data-free UAP methods often suffer from limited transferability due to the absence of semantic content in random noise. To address this issue, we propose a novel data-free universal attack method that recursively extracts pseudo-semantic priors directly from the UAPs during training to enrich the semantic content within the data-free UAP framework. Our approach effectively leverages latent semantic information within UAPs via region sampling, enabling successful input transformations (typically ineffective in traditional data-free UAP methods due to the lack of semantic cues) and significantly enhancing black-box transferability. Furthermore, we introduce a sample reweighting technique to mitigate potential imbalances from random sampling and transformations, emphasizing hard examples less affected by the UAPs. Comprehensive experiments on ImageNet show that our method achieves state-of-the-art performance in average fooling rate by a substantial margin, notably improves attack transferability across various CNN architectures compared to existing data-free UAP methods, and even surpasses data-dependent UAP methods. Code is available at: https://github.com/ChnanChan/PSP-UAP.
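The two ideas named in the abstract, region sampling from the current UAP and reweighting toward hard examples, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names, the random square crop with nearest-neighbor resizing, the `min_scale` and `temperature` parameters, and the softmax-over-negative-KL weighting are all assumptions chosen for illustration only.

```python
import numpy as np


def sample_regions(uap, num_samples, min_scale=0.3, rng=None):
    """Crop random regions of the UAP and resize each back to full size.

    Hypothetical stand-in for region sampling: each crop plays the role of
    a pseudo-semantic sample drawn from the evolving perturbation.
    uap has shape (H, W, C); the result has shape (num_samples, H, W, C).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = uap.shape
    out = np.empty((num_samples, h, w, c), dtype=uap.dtype)
    for i in range(num_samples):
        scale = rng.uniform(min_scale, 1.0)
        ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
        top = rng.integers(0, h - ch + 1)
        left = rng.integers(0, w - cw + 1)
        crop = uap[top:top + ch, left:left + cw]
        # Nearest-neighbor resize of the crop back to (H, W).
        rows = np.arange(h) * ch // h
        cols = np.arange(w) * cw // w
        out[i] = crop[rows][:, cols]
    return out


def kl_divergence(p, q, eps=1e-12):
    """Per-sample KL(p || q) for arrays of shape (N, K) holding probabilities."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * (np.log(p) - np.log(q)), axis=1)


def hard_example_weights(p_clean, p_adv, temperature=1.0):
    """Assumed weighting rule: samples whose predictions moved least
    (small KL between clean and perturbed outputs) get larger weights."""
    kl = kl_divergence(p_clean, p_adv)
    logits = -kl / temperature
    logits = logits - logits.max()  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()
```

In a training loop, the weights would multiply the per-sample attack loss so that samples the current UAP barely affects contribute more to the next update. The actual transformations, loss, and weighting form used by PSP-UAP may differ from this sketch.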
