ReVeil: Unconstrained Concealed Backdoor Attack on Deep Neural Networks using Machine Unlearning
Manaar Alam, Hithem Lamri, Michail Maniatakos
TL;DR
ReVeil is a concealed backdoor attack that targets the data collection phase and requires no access to the target model and no auxiliary data. The attacker injects camouflage samples (poisoned samples perturbed with isotropic Gaussian noise) into the training data alongside the poisoned samples. This significantly lowers the pre-deployment attack success rate (ASR) while preserving the backdoor's potential, which lets the attack evade three popular backdoor detection methods. After deployment, an exact unlearning request removes the camouflage samples and restores high ASR, with benign accuracy (BA) largely unaffected. The attack is evaluated across four datasets and four trigger patterns. The work also discusses extensions to multi-target backdoors and approximate unlearning, as well as potential defenses, and highlights the implications for ML security and data privacy.
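The camouflage construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the square corner trigger, the noise scale `sigma`, the clipping to [0, 1], and the choice to keep the original label on camouflage samples are all assumptions made for illustration, and the helper names `add_trigger` and `make_camouflage` are hypothetical.

```python
import numpy as np

def add_trigger(image, patch_value=1.0, patch_size=3):
    """Stamp a hypothetical square trigger in the bottom-right corner."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:, ...] = patch_value
    return poisoned

def make_camouflage(poisoned, sigma, rng):
    """Perturb a poisoned sample with isotropic Gaussian noise N(0, sigma^2 I)."""
    noise = rng.normal(loc=0.0, scale=sigma, size=poisoned.shape)
    # Clipping keeps pixel values valid; this step is an assumption of the sketch.
    return np.clip(poisoned + noise, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = rng.random((32, 32, 3))      # stand-in for a real training image
true_label, target_label = 3, 0      # illustrative class indices

poisoned = add_trigger(clean)
camouflage = make_camouflage(poisoned, sigma=0.1, rng=rng)

# Poisoned samples carry the attacker's target label.
poison_set = [(poisoned, target_label)]
# Assumption: camouflage samples keep the original label so that they
# counteract the backdoor until they are unlearned.
camouflage_set = [(camouflage, true_label)]
```

Both sets would be contributed to the training data during collection; only the camouflage set is later named in the unlearning request.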
Abstract
Backdoor attacks embed hidden functionalities in deep neural networks (DNNs) that trigger malicious behavior on specific inputs. Advanced defenses monitor anomalous DNN inferences to detect such attacks. However, concealed backdoors evade detection by maintaining a low pre-deployment attack success rate (ASR) and restoring high ASR post-deployment via machine unlearning. Existing concealed backdoors are often constrained by requiring white-box or black-box access to the target model or auxiliary data, which limits their practicality when such access or data is unavailable. This paper introduces ReVeil, a concealed backdoor attack that targets the data collection phase of the DNN training pipeline and requires no model access or auxiliary data. ReVeil maintains low pre-deployment ASR across four datasets and four trigger patterns, successfully evades three popular backdoor detection methods, and restores high ASR post-deployment through machine unlearning.
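As a rough illustration of the two quantities the abstract relies on, the sketch below computes ASR and models exact unlearning as retraining from scratch on the retained data. It assumes the common definition of ASR (the fraction of triggered inputs from non-target classes that are classified as the target class); the paper's exact evaluation protocol may differ. The names `attack_success_rate`, `exact_unlearn`, `predict`, and `train_fn` are hypothetical.

```python
import numpy as np

def attack_success_rate(predict, triggered_inputs, true_labels, target_label):
    """Fraction of triggered non-target-class inputs predicted as the target."""
    true_labels = np.asarray(true_labels)
    preds = np.asarray(predict(triggered_inputs))
    mask = true_labels != target_label   # skip inputs already in the target class
    if not mask.any():
        return float("nan")
    return float(np.mean(preds[mask] == target_label))

def exact_unlearn(train_fn, dataset, forget_ids):
    """Exact unlearning modeled as retraining from scratch without forget_ids."""
    forget = set(forget_ids)
    retained = [s for i, s in enumerate(dataset) if i not in forget]
    return train_fn(retained)

# Toy usage with a stand-in classifier that returns fixed predictions.
fixed_predictions = [0, 0, 1, 0]
asr = attack_success_rate(
    predict=lambda xs: fixed_predictions,
    triggered_inputs=[None] * 4,         # placeholders for triggered images
    true_labels=[1, 2, 3, 0],
    target_label=0,
)

# Stand-in "training" that just reports how many samples it was given.
dataset = ["clean_a", "poison_a", "camouflage_a", "clean_b"]
retained_count = exact_unlearn(train_fn=len, dataset=dataset, forget_ids=[2])
```

In this toy run, three inputs come from non-target classes and two of them are predicted as the target, so `asr` is 2/3. Measuring the same quantity before and after the unlearning request is how the drop and recovery in ASR would be observed.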
