Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection
Youheng Sun, Shengming Yuan, Xuanhan Wang, Lianli Gao, Jingkuan Song
TL;DR
The paper addresses generalized targeted adversarial attacks, which aim to mislead DNNs toward any target class, including classes unseen during training. It proposes GAKer, a generalized adversarial attacker that performs latent infection: a UNet-based generator injects components of the target object into the latent space of a clean image and is trained with cosine-feature losses computed in the space of a frozen feature extractor. The key contributions are (1) enabling attacks on unknown target classes with substantially higher success rates than prior generator-based methods (approximately $14.13\%$ higher for unknown classes and $4.23\%$ higher for known classes), and (2) demonstrating effectiveness against standard CNNs, adversarially trained models, and large vision-language models (LVLMs), with training that is efficient relative to prior methods. The approach provides a practical tool for broad vulnerability assessment and reveals pervasive weaknesses in contemporary DNNs and LVLMs, underscoring the security implications for real-world systems.
Abstract
Targeted adversarial attacks, which aim to mislead a model into recognizing any image as a target object through imperceptible perturbations, have become a mainstream tool for assessing the vulnerability of deep neural networks (DNNs). Since existing targeted attackers only learn to attack known target classes, they cannot generalize well to unknown classes. To tackle this issue, we propose $\bf{G}$eneralized $\bf{A}$dversarial attac$\bf{KER}$ ($\bf{GAKer}$), which is able to construct adversarial examples for any target class. The core idea behind GAKer is to craft a latently infected representation during adversarial example generation. To this end, the extracted latent representations of the target object are first injected into intermediate features of an input image in an adversarial generator. Then, the generator is optimized to ensure visual consistency with the input image while being close to the target object in the feature space. Because GAKer is both class-agnostic and model-agnostic, it can be regarded as a general tool that not only reveals the vulnerability of more DNNs but also identifies deficiencies of DNNs in a wider range of classes. Extensive experiments demonstrate the effectiveness of our proposed method in generating adversarial examples for both known and unknown classes. Notably, compared with other generative methods, our method achieves an approximately $14.13\%$ higher attack success rate for unknown classes and an approximately $4.23\%$ higher success rate for known classes. Our code is available at https://github.com/VL-Group/GAKer.
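The two objectives named in the abstract, closeness to the target in feature space and visual consistency with the input, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of flat Python lists in place of image and feature tensors, and the $L_\infty$ budget `eps` as the stand-in for "imperceptible" are all assumptions made here for illustration. The cosine form of the feature loss follows the TL;DR's mention of cosine-feature losses.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small constant avoids division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-12)

def feature_loss(adv_feat, target_feat):
    """Hypothetical feature-space loss: 1 - cos(adv, target).

    Close to 0 when the adversarial features point the same way as the
    target features, and 2 when they point in opposite directions.
    """
    return 1.0 - cosine_similarity(adv_feat, target_feat)

def project_linf(adv, clean, eps):
    """Keep the adversarial image visually close to the clean one.

    Assumption: imperceptibility is modeled as an L_inf ball of radius
    eps around the clean image, with pixel values kept in [0, 1].
    """
    projected = []
    for a, c in zip(adv, clean):
        a = min(max(a, c - eps), c + eps)      # clip the perturbation
        projected.append(min(max(a, 0.0), 1.0))  # clip to valid pixel range
    return projected

# Toy usage: three "pixels" and two-dimensional "features".
clean = [0.5, 0.5, 0.5]
raw_adv = [0.9, 0.1, 0.5]
adv = project_linf(raw_adv, clean, eps=0.1)
loss = feature_loss([1.0, 0.0], [0.0, 1.0])
```

In an actual training loop the features would come from a frozen extractor applied to the generator's output and to the target image, and the loss would be minimized with respect to the generator's parameters. The sketch only shows the shape of the two constraints.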
