AutoDFP: Automatic Data-Free Pruning via Channel Similarity Reconstruction

Siqi Li; Jun Chen; Jingyang Xiang; Chengrui Zhu; Yong Liu

TL;DR

This paper proposes Automatic Data-Free Pruning (AutoDFP), a method that prunes and reconstructs a network automatically without fine-tuning. It rests on the assumption that the information lost through pruning can be partially compensated by retaining focused information from similar channels.

Abstract

Structured pruning methods are developed to bridge the gap between the massive scale of neural networks and limited hardware resources. Most current structured pruning methods rely on training datasets to fine-tune the compressed model, which imposes a high computational burden and makes them inapplicable in scenarios with stringent privacy and security requirements. Some data-free methods have been proposed as an alternative, but they often require handcrafted parameter tuning and can only achieve inflexible reconstruction. In this paper, we propose the Automatic Data-Free Pruning (AutoDFP) method, which achieves automatic pruning and reconstruction without fine-tuning. Our approach is based on the assumption that the loss of information can be partially compensated by retaining focused information from similar channels. Specifically, we formulate data-free pruning as an optimization problem that can be effectively addressed through reinforcement learning. AutoDFP assesses the similarity of channels in each layer and provides this information to the reinforcement learning agent, which guides the pruning and reconstruction of the network. We evaluate AutoDFP with multiple networks on multiple datasets, achieving impressive compression results. For instance, on the CIFAR-10 dataset, AutoDFP demonstrates a 2.87\% reduction in accuracy loss compared to the recently proposed data-free pruning method DFPC, with fewer FLOPs, on VGG-16. Furthermore, on the ImageNet dataset, AutoDFP achieves 43.17\% higher accuracy than the SOTA method at the same 80\% preserved ratio on MobileNet-V1.
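The reconstruction assumption in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes two bias-free fully connected layers with a ReLU in between, uses cosine similarity between weight rows as a stand-in for the channel similarity measure, and treats `lam` as a hypothetical fixed reconstruction strength (in AutoDFP the per-layer pruning ratio and reconstruction strength are chosen by the reinforcement learning agent). The function names are illustrative only.

```python
import math


def norm(v):
    """Euclidean norm of a weight vector."""
    return math.sqrt(sum(a * a for a in v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = norm(u), norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def prune_and_reconstruct(w1, w2, pruned, lam=1.0):
    """Remove channels `pruned` from layer 1 and compensate in layer 2.

    w1: rows are the output channels of layer l.
    w2: rows are the outputs of layer l+1, columns index layer-l channels.
    Assumes at least one channel is kept.
    """
    kept = [c for c in range(len(w1)) if c not in pruned]
    new_w1 = [list(w1[c]) for c in kept]
    new_w2 = [[row[c] for c in kept] for row in w2]
    for j in pruned:
        # Most similar retained channel (assumed similarity measure).
        i = max(kept, key=lambda c: cosine(w1[c], w1[j]))
        sim = cosine(w1[i], w1[j])
        if sim <= 0.0:
            continue  # no similar channel: nothing to compensate with
        scale = norm(w1[j]) / norm(w1[i])
        col = kept.index(i)
        for r, row in enumerate(w2):
            # Fold the pruned channel's contribution into channel i.
            new_w2[r][col] += lam * sim * scale * row[j]
    return new_w1, new_w2


def forward(w1, w2, x):
    """Two linear layers with a ReLU in between, no biases."""
    hidden = [max(0.0, sum(a * b for a, b in zip(row, x))) for row in w1]
    return [sum(a * b for a, b in zip(row, hidden)) for row in w2]


# Channel 2 is an exact positive multiple of channel 0.
w1 = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
w2 = [[1.0, 1.0, 1.0]]
x = [3.0, 4.0]

original = forward(w1, w2, x)
naive = forward(*prune_and_reconstruct(w1, w2, [2], lam=0.0), x)
restored = forward(*prune_and_reconstruct(w1, w2, [2], lam=1.0), x)
```

In this constructed case the pruned channel is perfectly redundant, so the compensated network reproduces the original output while plain pruning (`lam=0.0`) does not. With real weights the match is only approximate, which is why the abstract speaks of partial compensation.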

Paper Structure (26 sections, 22 equations, 7 figures, 6 tables, 1 algorithm)

Introduction
Related Works
Automatic Network Pruning.
Data-Free Pruning.
Problem Formulation
Background and Notation
Reconstruction Assumption
Problem Definition
Methodology
Layer-wise Reconstruction
Markov Decision Process
Solution via Reinforcement Learning
Reward function
Action Space
State Space
...and 11 more sections

Figures (7)

Figure 1: Overview of AutoDFP. The upper part of the figure shows the outcome of pruning the $\ell^{th}$ layer using the conventional method with a constant pruning rate $\bar{p}$. As a result, the $(\ell+1)^{th}$ layer receives a damaged feature map $\hat{Z}^{(\ell+1)}$. The bottom part of the figure shows the procedure of pruning the $\ell^{th}$ layer and reconstructing the $(\ell+1)^{th}$ layer with the AutoDFP method. Both the specially designed pruning ratio $p_\ell$ and the reconstruction within the purple box are guided by a reinforcement learning agent, which ultimately yields the restored feature map $\tilde{Z}^{(\ell+1)}$.
Figure 2: t-SNE visualization of DBSCAN clustering results for the channels of one layer of the network; points of different colors represent different clusters. Left: The clustering result for the channels of a layer in VGG-16, where the red points represent noise points. Right: The clustering result for a layer in ResNet-101.
Figure 3: The reinforcement learning framework of AutoDFP. As depicted in the blue box, AutoDFP uses the DBSCAN clustering algorithm and the bias matrix to evaluate channel similarity for each layer of the original model. The state containing this information is passed to a Soft Actor-Critic agent, which then produces two continuous actions, $p_\ell$ and $\lambda_\ell$. These actions direct the network's pruning and reconstruction procedures.
Figure 4: The value of $\mathcal{P}_{B<t}$ and the reconstruction strategy of ResNet-50 on the ImageNet dataset. The purple line represents the reconstruction strategy $\lambda_\ell$ of each layer given by the reinforcement learning agent, and the green line represents a component $\mathcal{P}_{B<t}$ in the state $s_\ell$ given to the agent.
Figure 5: The value of $C_{noise}$ and the pruning strategy of VGG-16 on the CIFAR-10 dataset. The purple broken line represents the pruning strategy $p_\ell$ of each layer given by the agent, and the yellow broken line represents a component $C_{noise}$ in the state $s_\ell$ given to the agent.
...and 2 more figures
