Towards Cross-Domain Multi-Targeted Adversarial Attacks

Taïga Gonçalves; Tomo Miyazaki; Shinichiro Omachi

TL;DR

CD-MTA tackles cross-domain targeted adversarial attacks without access to the victim's training data by conditioning perturbations on a single target image and guiding the generator with class-agnostic feature objectives. It introduces a Feature Injection Module (FIM) that blends source and target features using SPADE, and it enforces a dual objective: $L_{feat}$ to align intermediate features and $L_{fr}$ to reconstruct target features. The attack can be summarized by a loss of the form $\min_{\delta} \ell(f(x_s + \delta), f(x_t))$, subject to a budget constraint on $\|\delta\|$. The approach avoids data leakage and demonstrates state-of-the-art performance on unseen target classes across ImageNet and seven additional datasets, including cross-domain transfers, without target-domain training. These findings reveal critical security concerns for privately trained models and motivate the development of stronger defenses against leakage-free, cross-domain targeted attacks.
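The feature-matching objective summarized above can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the paper trains a generator, whereas the sketch optimizes a single perturbation directly for one source/target pair. The toy feature extractor, the L-infinity budget `eps`, the step size, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only. Every name and constant here is hypothetical;
# the source does not specify the norm, the budget, or the feature extractor.

def project_linf(delta, eps):
    """Clip each perturbation component into [-eps, eps] (assumed L-inf budget)."""
    return [max(-eps, min(eps, d)) for d in delta]

def toy_features(x):
    """Stand-in for an intermediate feature map f(x): sums of neighbouring values."""
    return [x[i] + x[i + 1] for i in range(len(x) - 1)]

def feature_loss(x_s, delta, x_t):
    """Mean squared distance between f(x_s + delta) and f(x_t)."""
    adv = [a + d for a, d in zip(x_s, delta)]
    f_adv, f_tgt = toy_features(adv), toy_features(x_t)
    return sum((a - b) ** 2 for a, b in zip(f_adv, f_tgt)) / len(f_adv)

def attack(x_s, x_t, eps=0.1, lr=0.05, steps=50, h=1e-4):
    """Projected gradient descent on delta using finite-difference gradients."""
    delta = [0.0] * len(x_s)
    for _ in range(steps):
        base = feature_loss(x_s, delta, x_t)
        grad = []
        for i in range(len(delta)):
            bumped = delta[:]
            bumped[i] += h
            grad.append((feature_loss(x_s, bumped, x_t) - base) / h)
        # Gradient step, then projection back into the perturbation budget.
        delta = project_linf([d - lr * g for d, g in zip(delta, grad)], eps)
    return delta

x_s = [0.2, 0.4, 0.6]   # toy "source image"
x_t = [0.5, 0.5, 0.5]   # toy "target image"
delta = attack(x_s, x_t)
print(feature_loss(x_s, [0.0] * 3, x_t), feature_loss(x_s, delta, x_t))
```

The projection step is what enforces the budget on the perturbation; the loss only uses features of the target image, never a class label, which mirrors the class-agnostic framing of the summary.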

Abstract

Multi-targeted adversarial attacks aim to mislead classifiers toward specific target classes using a single perturbation generator with a conditional input specifying the desired target class. Existing methods face two key limitations: (1) a single generator supports only a limited number of predefined target classes, and (2) it requires access to the victim model's training data to learn target class semantics. This dependency raises data leakage concerns in practical black-box scenarios where the training data is typically private. To address these limitations, we propose a novel Cross-Domain Multi-Targeted Attack (CD-MTA) that can generate perturbations toward arbitrary target classes, even those that do not exist in the attacker's training data. CD-MTA is trained on a single public dataset but can perform targeted attacks on black-box models trained on different datasets with disjoint and unknown class sets. Our method requires only a single example image that visually represents the desired target class, without relying on its label, class distribution, or pretrained embeddings. We achieve this through a Feature Injection Module (FIM) and class-agnostic objectives which guide the generator to extract transferable, fine-grained features from the target image without inferring class semantics. Experiments on ImageNet and seven additional datasets show that CD-MTA outperforms existing multi-targeted attack methods on unseen target classes in black-box and cross-domain scenarios. The code is available at https://github.com/tgoncalv/CD-MTA.
