Purification of Contaminated Convolutional Neural Networks via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case
Hanxiao Lu, Zeyu Huang, Ren Wang
TL;DR
The paper addresses the purification of CNNs whose parameters have been contaminated by natural noise or by artificially injected noise such as backdoor attacks, and provides an exact recovery guarantee for one-hidden-layer non-overlapping ReLU CNNs in the overparameterized setting. It introduces an L1-based robust recovery framework that projects the contaminated weights onto carefully designed subspaces via $A_W$ and $A_\beta$, which allows $W$ and $\beta$ to be reconstructed accurately with high probability. The theoretical guarantees (Theorems 1 and 2) are supported by empirical validation on synthetic data, MNIST, and CIFAR-10, including demonstrations of backdoor attack mitigation using limited benign data. Experiments further suggest that the method extends to multi-layer CNNs and could serve as a practical defense against model poisoning when only a small amount of benign data is available.
Abstract
Convolutional neural networks (CNNs), one of the key architectures of deep learning, have achieved superior performance on many machine learning tasks such as image classification and video recognition, and in application domains such as power systems. Despite their success, CNNs can be easily contaminated by natural noise and by artificially injected noise such as backdoor attacks. In this paper, we propose a robust recovery method to remove the noise from potentially contaminated CNNs and provide an exact recovery guarantee for one-hidden-layer non-overlapping CNNs with the rectified linear unit (ReLU) activation function. Our theoretical results show that both the weights and the biases of the CNN can be exactly recovered in the overparameterized setting under some mild assumptions. The experimental results demonstrate the correctness of the proofs and the effectiveness of the method in both the synthetic environment and the practical neural network setting. Our results also indicate that the proposed method can be extended to multi-layer CNNs and can potentially serve as a defense strategy against backdoor attacks.
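To make the L1-based recovery idea mentioned in the TL;DR concrete, the following is a minimal sketch on synthetic data, not the paper's implementation. It assumes that the clean parameter vector lies in the column span of a known design matrix `A` (a hypothetical stand-in for $A_W$ or $A_\beta$, whose actual construction is not given in this section) and that the contamination `z` is sparse. The solver used here, iteratively reweighted least squares (IRLS), is an illustrative choice for approximating the L1 minimizer and is not taken from the paper.

```python
import numpy as np

def l1_recover(A, eta, n_iter=100, eps=1e-8):
    """Approximately solve min_u ||eta - A @ u||_1 with IRLS (illustrative solver)."""
    u = np.linalg.lstsq(A, eta, rcond=None)[0]   # start from the least-squares fit
    for _ in range(n_iter):
        r = eta - A @ u                          # current residuals
        w = 1.0 / np.maximum(np.abs(r), eps)     # large residual -> small weight
        Aw = A * w[:, None]                      # rows of A scaled by their weights
        u = np.linalg.solve(A.T @ Aw, Aw.T @ eta)  # weighted normal equations
    return u

rng = np.random.default_rng(0)
p, k = 400, 20                                   # p parameters, k-dimensional subspace (assumed sizes)
A = rng.standard_normal((p, k))                  # hypothetical design matrix
theta = A @ rng.standard_normal(k)               # clean parameters, in the span of A by assumption

z = np.zeros(p)                                  # sparse contamination: 10% of the entries
idx = rng.choice(p, size=p // 10, replace=False)
z[idx] = 10.0 * rng.standard_normal(idx.size)
eta = theta + z                                  # contaminated parameters

theta_l1 = A @ l1_recover(A, eta)                          # L1-based purification
theta_ls = A @ np.linalg.lstsq(A, eta, rcond=None)[0]      # plain least-squares projection

err_l1 = np.linalg.norm(theta_l1 - theta) / np.linalg.norm(theta)
err_ls = np.linalg.norm(theta_ls - theta) / np.linalg.norm(theta)
print(f"relative error  L1: {err_l1:.2e}   least squares: {err_ls:.2e}")
```

In this toy setting the L1 fit is far less sensitive to the sparse corruption than the least-squares projection, which spreads the effect of the corrupted entries over all coordinates. That contrast is the intuition behind using an L1 objective for purification.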
