
Rethinking Pruning for Backdoor Mitigation: An Optimization Perspective

Nan Li, Haiyang Yu, Ping Yi

TL;DR

This work tackles backdoor mitigation in deep neural networks by reframing pruning as an optimization problem. It introduces Optimized Neuron Pruning (ONP), which constructs graphs from neuron connections and employs Graph Neural Networks with Reinforcement Learning to learn pruning policies that remove backdoor neurons while preserving clean accuracy (CA). ONP demonstrates state-of-the-art mitigation across multiple attacks and datasets with limited defense data, and extends pruning strategies to handle residual connections via group-based pruning. The findings suggest that optimization-driven pruning can effectively expose and erase backdoors with practical data efficiency and broad architectural applicability, offering a promising direction for robust backdoor defenses.

Abstract

Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attacks, which pose a serious threat to their reliable deployment. Recent research reveals that backdoors can be erased from infected DNNs by pruning a specific group of neurons, but how to effectively identify and remove these backdoor-associated neurons remains an open challenge. Most existing defense methods rely on defined rules and focus on neurons' local properties, ignoring the exploration and optimization of pruning policies. To address this gap, we propose an Optimized Neuron Pruning (ONP) method that combines a Graph Neural Network (GNN) with Reinforcement Learning (RL) to repair backdoored models. Specifically, ONP first models the target DNN as graphs based on neuron connectivity, and then uses GNN-based RL agents to learn graph embeddings and find a suitable pruning policy. To the best of our knowledge, this is the first attempt to employ GNN and RL for optimizing pruning policies in the field of backdoor defense. Experiments show that, with a small amount of clean data, ONP can effectively prune the backdoor neurons implanted by a set of backdoor attacks at the cost of negligible performance degradation, achieving new state-of-the-art performance for backdoor mitigation.
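
The two steps named in the abstract (modeling a layer as a graph from neuron connectivity, then applying a pruning policy) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_layer_graph` and `apply_pruning_mask`, the use of absolute weight as connection strength, and the top-k edge rule are all assumptions made for illustration, and the GNN-based RL agent that would actually choose which neurons to prune is left out.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int, float]  # (previous-layer neuron, current-layer neuron, strength)


def build_layer_graph(weights: List[List[float]], top_k: int = 2) -> List[Edge]:
    """Build edges between two adjacent layers (hypothetical helper).

    weights[j][i] is an assumed scalar connection strength from neuron i in
    the previous layer to neuron j in the current layer. For each current
    neuron, only its top_k strongest incoming connections are kept as edges.
    """
    edges: List[Edge] = []
    for j, row in enumerate(weights):
        # Rank previous-layer neurons by absolute connection strength.
        ranked = sorted(range(len(row)), key=lambda i: abs(row[i]), reverse=True)
        for i in ranked[:top_k]:
            edges.append((i, j, abs(row[i])))
    return edges


def apply_pruning_mask(activations: List[float], pruned: Set[int]) -> List[float]:
    """Zero the outputs of pruned neurons (hypothetical helper).

    `pruned` stands in for the policy output; in ONP this would come from
    the learned agent, which is not modeled here.
    """
    return [0.0 if idx in pruned else a for idx, a in enumerate(activations)]


if __name__ == "__main__":
    # Toy layer: 2 current neurons, 3 previous neurons.
    toy_weights = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.7]]
    print(build_layer_graph(toy_weights, top_k=1))
    print(apply_pruning_mask([1.0, 2.0, 3.0], pruned={1}))
```

Masking rather than deleting neurons keeps tensor shapes unchanged, which is a common convenience when evaluating candidate pruning policies; whether ONP does this is not stated in the abstract.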

Paper Structure (35 sections, 8 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: BLC and CLC values of neurons in a backdoored ResNet18
  • Figure 2: Neuron connections and an example of a constructed graph. (a) Backdoor and clean neurons primarily connect with neurons of the same type in the previous layer. (b) Part of the graph constructed for the second block of ResNet18, where red nodes represent potential backdoor neurons with large BLC values
  • Figure 3: Overview of our proposed ONP. ONP converts the infected model into graphs by neuron connections to exploit the inherent similarities among backdoor neurons, and then employs an RL agent containing a GNN to learn from the graphs and optimize the pruning policy
  • Figure 4: Channel relevance due to residual connections and the corresponding pruning strategy. (a) Activation map of the last two residual blocks in an infected ResNet18 with poisoned samples as input; only 144 neurons are selected for simplicity. (b) Group-based pruning strategy for the last two residual blocks in ResNet18
  • Figure 5: Pruning policies for the last convolutional layer of a backdoored ResNet18, derived from 4 different defense methods
  • ...and 1 more figure