Purification Of Contaminated Convolutional Neural Networks Via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case

Hanxiao Lu; Zeyu Huang; Ren Wang

TL;DR

The paper tackles the challenge of purifying CNNs contaminated by natural and backdoor-style noises, providing an exact recovery guarantee for one-hidden-layer ReLU CNNs under overparameterization. It introduces an L1-based robust recovery framework that projects contaminated weights onto carefully designed subspaces via $A_W$ and $A_\beta$, enabling accurate reconstruction of $W$ and $\beta$ with high probability. Theoretical guarantees (Theorems 1 and 2) are supplemented by empirical validation on synthetic data, MNIST, and CIFAR-10, including demonstrations of backdoor attack mitigation using limited benign data. The work further shows promising extensions to multi-layer CNNs through experiments, highlighting its potential as a practical defense against model poisoning with minimal data requirements.

Abstract

Convolutional neural networks (CNNs), one of the key architectures of deep learning models, have achieved superior performance on many machine learning tasks such as image classification, video recognition, and power systems. Despite their success, CNNs can be easily contaminated by natural noises and artificially injected noises such as backdoor attacks. In this paper, we propose a robust recovery method to remove the noise from the potentially contaminated CNNs and provide an exact recovery guarantee on one-hidden-layer non-overlapping CNNs with the rectified linear unit (ReLU) activation function. Our theoretical results show that both CNNs' weights and biases can be exactly recovered under the overparameterization setting with some mild assumptions. The experimental results demonstrate the correctness of the proofs and the effectiveness of the method in both the synthetic environment and the practical neural network setting. Our results also indicate that the proposed method can be extended to multiple-layer CNNs and potentially serve as a defense strategy against backdoor attacks.
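
The $\ell_1$-based recovery idea summarized in the TL;DR can be illustrated with a deliberately small toy problem. The sketch below is not the paper's algorithm: it assumes a single scalar coefficient `u`, a random Gaussian design vector `a` standing in for a design matrix, and clean parameters of the form `theta[i] = a[i] * u`. The function names, corruption fraction, and noise scale are hypothetical choices for illustration. It shows why a least-absolute-deviation fit can ignore a minority of arbitrarily corrupted entries while a least-squares fit cannot.

```python
import random


def l1_fit_scalar(a, eta):
    """Return argmin_u sum_i |eta[i] - a[i] * u|.

    Each term equals |a[i]| * |eta[i] / a[i] - u|, so a minimizer is the
    weighted median of the ratios eta[i] / a[i] with weights |a[i]|.
    """
    pairs = sorted((e / x, abs(x)) for x, e in zip(a, eta) if x != 0.0)
    if not pairs:
        raise ValueError("need at least one nonzero design entry")
    half = 0.5 * sum(w for _, w in pairs)
    running = 0.0
    for ratio, w in pairs:
        running += w
        if running >= half:
            return ratio  # lower weighted median
    return pairs[-1][0]


def l2_fit_scalar(a, eta):
    """Return argmin_u sum_i (eta[i] - a[i] * u)^2 (ordinary least squares)."""
    return sum(x * e for x, e in zip(a, eta)) / sum(x * x for x in a)


def demo(seed=0, dim=200, frac=0.2):
    """Corrupt a fraction of theta and compare l1 and l2 reconstructions.

    All settings here (dim, frac, noise scale 10) are illustrative assumptions.
    """
    rng = random.Random(seed)
    u_true = 1.5
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # toy design vector
    theta = [x * u_true for x in a]                 # clean parameters
    eta = list(theta)                               # contaminated copy
    for i in rng.sample(range(dim), int(frac * dim)):
        eta[i] += rng.gauss(0.0, 10.0)              # sparse, large noise
    u_l1 = l1_fit_scalar(a, eta)
    u_l2 = l2_fit_scalar(a, eta)
    # Project back onto the one-dimensional subspace and measure the error.
    err_l1 = max(abs(x * u_l1 - t) for x, t in zip(a, theta))
    err_l2 = max(abs(x * u_l2 - t) for x, t in zip(a, theta))
    return err_l1, err_l2


if __name__ == "__main__":
    err_l1, err_l2 = demo()
    print("max error, l1 fit:", err_l1)
    print("max error, l2 fit:", err_l2)
```

As a hand-checkable case, `l1_fit_scalar([1.0, 1.0, 1.0], [2.0, 2.0, 100.0])` returns `2.0`, whereas the least-squares fit is pulled toward the outlier. In this scalar sketch the weighted median lands on an uncorrupted ratio whenever the corrupted entries carry less than half of the total weight `sum(|a[i]|)`; the paper's guarantees for $W$ and $\beta$ concern the multivariate case with the design matrices $A_W$ and $A_\beta$, which this toy does not reproduce.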

Paper Structure (21 sections, 12 theorems, 154 equations, 13 figures, 2 algorithms)

Introduction
Problem formulation
CNN model
Corrupted model
Purification of One-hidden-Layer CNN Algorithm
CNN model training
Robust recovery for CNN purification
Design Matrix of hidden layer $A_W$
Design Matrix of output layer $A_{\beta}$
Theoretical Recovery Guarantee
Experiment
Experiments on synthetic data
Experiments on MNIST
Experiments on CIFAR-10
Poisoning attack mitigation
...and 6 more sections

Key Result

Lemma 1

Assume that $\frac{mn}{k}$ ($\frac{mn}{\sqrt{p}}, \frac{n\log(mn)}{k}$) is sufficiently small. Then the following upper and lower bounds hold for $A=A_{W}$ ($A=A_{\beta}$) with some constants $\sigma^2$, $\underline{\lambda}$, and $\bar{\lambda}$, where $|A|$ is the number of columns of $A$ and $D_{A}$ is the dimension of $A_i$. $c_1,\cdots,c_{|A|}$ are fixed values satisfying $\max_i|c_i|\leq 1$. $A$ is eit

Figures (13)

Figure 1: Conceptual diagram illustrating the proposed framework. Hidden-layer weights $W$ and output layer weights $\beta$ of a convolutional neural network (CNN) are contaminated by noises. The proposed CNN purification method can remove noises from contaminated weights.
Figure 2: Increasing $p$ and $k$ improves recovery performance on synthetic data ($n = 5$, $m = 5$). Experiments under the setting of Theorem \ref{thm: main1}. When $p$ increases, the limit of $\epsilon$ for successful recovery of $\beta$ also increases. When $k$ increases, the limit of $\epsilon$ for successful recovery of $W$ increases.
Figure 3: Increasing $p$ and $k$ improves recovery performance on synthetic data ($n = 5$, $m = 5$). Experiments under the setting of Theorem \ref{thm: main2}. When $p$ increases, the limit of $\epsilon$ for successful recovery of $\beta$ also increases. When $k$ increases, the limit of $\epsilon$ for successful recovery of $W$ increases.
Figure 4: Decreasing $m$ improves recovery performance on synthetic data ($n = 5$, $p = 500$, $k = 200$). Experiments under the setting of Theorem \ref{thm: main1}. When $m$ decreases, the limit of $\epsilon$ for successful recovery of both $W$ and $\beta$ increases.
Figure 5: Increasing $p$ improves recovery performance on the MNIST dataset ($n = 21$). Experiments under the setting of Theorem \ref{thm: main1}. When $p$ increases, the limit of $\epsilon$ for successful recovery of $\beta$ also increases.
...and 8 more figures

Theorems & Definitions (18)

Lemma 1
Proof: Proof of Lemma \ref{lemma: conditions}
Theorem 1
Proof: Proof of Theorem \ref{thm: bound_iteration}
Theorem 2
Proof: Proof of Theorem \ref{thm: main1}
Theorem 3
Proof: Proof of Theorem \ref{thm: main2}
Lemma 2
Lemma 3
...and 8 more
