End-to-End Anti-Backdoor Learning on Images and Time Series
Yujing Jiang, Xingjun Ma, Sarah Monazam Erfani, Yige Li, James Bailey
TL;DR
This work tackles the threat of backdoor attacks in deep neural networks for both image and time-series data by introducing End-to-End Anti-Backdoor Learning (E2ABL). E2ABL adds a second head attached to shallow layers to detect backdoor signals and cleanse poisoned samples during training, coupled with a true class recovery step to rectify labels. Across nine attacks and multiple datasets, E2ABL achieves higher clean accuracy and lower attack success rates than strong baselines, demonstrating robust, end-to-end defense without prior knowledge of the attack. The approach offers a practical, scalable baseline for secure training on poisoned data in real-world, safety-critical applications.
Abstract
Backdoor attacks present a substantial security concern for deep learning models, especially those used in safety- and security-critical applications. These attacks manipulate model behavior by embedding a hidden trigger during the training phase, allowing unauthorized control over the model's output at inference time. Although numerous defenses exist for image classification models, there is a conspicuous absence of both defenses tailored for time series data and end-to-end solutions capable of training clean models on poisoned data. To address this gap, this paper builds upon Anti-Backdoor Learning (ABL) and introduces End-to-End Anti-Backdoor Learning (E2ABL), a method for robust training against backdoor attacks. Unlike the original ABL, which employs a two-stage training procedure, E2ABL accomplishes end-to-end training through an additional classification head linked to the shallow layers of a Deep Neural Network (DNN). This secondary head actively identifies potential backdoor triggers, allowing the model to dynamically cleanse these samples and their corresponding labels during training. Our experiments reveal that E2ABL significantly improves on existing defenses and is effective against a broad range of backdoor attacks in both image and time series domains.
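As a rough illustration of the cleansing and label-recovery idea described above, the following Python sketch operates on per-sample class probabilities from the two heads. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `cleanse_batch`, the `threshold` parameter, the confidence-based flagging rule, and the recovery rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def cleanse_batch(
    labels: Sequence[int],
    secondary_probs: Sequence[Sequence[float]],
    main_probs: Sequence[Sequence[float]],
    threshold: float = 0.9,
) -> Tuple[List[int], List[bool]]:
    """Flag suspected backdoor samples and propose recovered labels.

    Assumption (hypothetical, not from the paper): a sample is treated as
    suspicious when the secondary (shallow-layer) head already assigns a
    probability of at least `threshold` to the sample's given label.
    """
    new_labels: List[int] = []
    flagged: List[bool] = []
    for given, p_sec, p_main in zip(labels, secondary_probs, main_probs):
        suspicious = p_sec[given] >= threshold
        recovered = given
        if suspicious:
            # Hypothetical recovery rule: discard the given label (which may
            # be the attacker's target) and take the main head's most likely
            # class among the remaining ones.
            candidates = [c for c in range(len(p_main)) if c != given]
            recovered = max(candidates, key=lambda c: p_main[c])
        new_labels.append(recovered)
        flagged.append(suspicious)
    return new_labels, flagged


# Toy batch of three samples over three classes.
labels = [0, 2, 1]
secondary_probs = [[0.97, 0.02, 0.01], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
main_probs = [[0.5, 0.1, 0.4], [0.1, 0.2, 0.7], [0.2, 0.6, 0.2]]

print(cleanse_batch(labels, secondary_probs, main_probs))
# prints ([2, 2, 1], [True, False, False])
```

In this toy batch only the first sample is flagged, because the secondary head is highly confident in its given label; its label is then replaced by the main head's best alternative class, while the other two samples pass through unchanged. In an actual training loop the probabilities would come from the two heads of the network on each mini-batch.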
