3S-Attack: Spatial, Spectral and Semantic Invisible Backdoor Attack Against DNN Models
Jianyao Yin, Luca Arnaboldi, Honglong Chen, Pascal Berrang, Mark Ryan
TL;DR
This paper introduces 3S-attack, a backdoor attack that remains stealthy across the spatial, spectral, and semantic domains. It extracts class-relevant semantic features with Grad-CAM from a lightweight preliminary model and embeds them in the spectral domain via the discrete cosine transform (DCT), with pixel-level constraints to preserve perceptual indistinguishability. The attack operates without access to the victim's training process, which broadens the range of realistic threat scenarios, and it achieves strong attack success while maintaining high perceptual quality across diverse datasets. The work also analyzes robustness to parameter variations and evaluates defense resistance, showing that several state-of-the-art defenses struggle to detect or neutralize 3S-attack. These findings highlight vulnerabilities at the intersection of robustness and semantic interpretability and underscore the need for stronger, multi-domain defenses in AI systems.
Abstract
Backdoor attacks implant hidden behaviors into models by poisoning training data or modifying the model directly. These attacks aim to maintain high accuracy on benign inputs while causing misclassification when a specific trigger is present. While existing studies have explored stealthy triggers in the spatial and spectral domains, few incorporate the semantic domain. In this paper, we propose 3S-attack, a novel backdoor attack that is stealthy across the spatial, spectral, and semantic domains. The key idea is to use the semantic features of benign samples as triggers, extracting them with Gradient-weighted Class Activation Mapping (Grad-CAM) and a preliminary model. We then embed the trigger in the spectral domain and apply pixel-level restrictions in the spatial domain. This process minimizes the distance between poisoned and benign samples, making the attack harder to detect by existing defenses and by human inspection. It also exposes a vulnerability at the intersection of robustness and semantic interpretability, revealing that models can be manipulated to act in semantically consistent yet malicious ways. Extensive experiments on various datasets, along with theoretical analysis, demonstrate the stealthiness of 3S-attack and highlight the need for stronger defenses to ensure AI security.
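To make the spectral-embedding and pixel-restriction steps concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a single-channel image, a precomputed saliency map standing in for a Grad-CAM heatmap, a simple linear blend of DCT coefficients, and an L-infinity bound on the per-pixel change. The names `dct_matrix`, `embed_trigger`, `alpha`, and `eps` are hypothetical and do not come from the paper.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def embed_trigger(benign, trigger, saliency, alpha=0.1, eps=4.0):
    """Toy illustration (assumed blend rule, not the paper's exact method).

    benign, trigger: float arrays of shape (H, W) with values in [0, 255]
    saliency: array of shape (H, W) in [0, 1], a stand-in for a Grad-CAM map
    alpha: assumed blend strength in the DCT domain
    eps: assumed maximum per-pixel change in the spatial domain
    """
    h, w = benign.shape
    dh, dw = dct_matrix(h), dct_matrix(w)

    # Semantic step: keep only the salient region of the trigger source.
    features = trigger * saliency

    # Spectral step: 2-D DCT of both images, then blend the coefficients.
    benign_spec = dh @ benign @ dw.T
    feature_spec = dh @ features @ dw.T
    mixed_spec = (1.0 - alpha) * benign_spec + alpha * feature_spec

    # Inverse 2-D DCT (the basis is orthonormal, so the inverse is the transpose).
    candidate = dh.T @ mixed_spec @ dw

    # Spatial step: bound each pixel's change, then keep a valid intensity range.
    delta = np.clip(candidate - benign, -eps, eps)
    return np.clip(benign + delta, 0.0, 255.0)
```

With random 8x8 inputs, the output stays within `eps` of the benign image at every pixel, which is the property the pixel-level restriction is meant to provide in this sketch. How the paper selects DCT coefficients and sets its constraints is not specified in the abstract, so those details are left as assumptions here.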
