Training-Free Restoration of Pruned Neural Networks
Keonho Lee, Minsoo Kim, Dong-Wan Choi
TL;DR
This work tackles restoring pruned CNNs without access to training data or additional fine-tuning. It introduces Leave Before You Leave (LBYL), a data-free recovery strategy that distributes the information of each pruned filter across multiple preserved filters via a delivery matrix, leading to a data-free reconstruction loss with a convex, closed-form solution. The reconstruction error decomposes into Residual Error, Batch Normalization Error, and Activation Error, and is minimized with a regularized least-squares solution that yields a unique optimum for the recovery coefficients. Empirically, LBYL consistently surpasses one-to-one compensation (NM) and other data-free baselines across CIFAR-10/100, ImageNet, and COCO, including transfer scenarios, demonstrating improved reconstruction quality and practical utility when neither data nor fine-tuning is available.
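To make the regularized least-squares step concrete, here is a minimal NumPy sketch. It covers only a residual-style term, that is, approximating one flattened pruned filter as a weighted sum of preserved filters, and it leaves out the Batch Normalization and Activation Error terms entirely. The function names, the regularization weight `lam`, the random filters, and the one-to-one stand-in baseline are all illustrative assumptions, not the paper's actual formulation or code.

```python
import numpy as np

def recovery_coefficients(pruned, preserved, lam=1e-2):
    """Closed-form ridge solution (illustrative, residual term only).

    Minimizes ||pruned - preserved @ s||^2 + lam * ||s||^2 over s.
    pruned:    (d,)   flattened weights of one pruned filter
    preserved: (d, k) flattened weights of k preserved filters (columns)
    """
    k = preserved.shape[1]
    gram = preserved.T @ preserved + lam * np.eye(k)  # positive definite for lam > 0
    return np.linalg.solve(gram, preserved.T @ pruned)

def objective(pruned, preserved, s, lam):
    """Regularized reconstruction loss for a given coefficient vector s."""
    residual = pruned - preserved @ s
    return float(residual @ residual + lam * (s @ s))

# Hypothetical toy layer: 8 preserved 3x3x3 filters and 1 pruned filter.
rng = np.random.default_rng(0)
preserved = rng.standard_normal((27, 8))
pruned = rng.standard_normal(27)
lam = 1e-2

s_star = recovery_coefficients(pruned, preserved, lam)

# Stand-in for one-to-one compensation (an assumption, not the NM algorithm):
# use only the most cosine-similar preserved filter, with its best scale.
cos = (preserved.T @ pruned) / (
    np.linalg.norm(preserved, axis=0) * np.linalg.norm(pruned)
)
j = int(np.argmax(np.abs(cos)))
s_one = np.zeros(preserved.shape[1])
s_one[j] = (preserved[:, j] @ pruned) / (preserved[:, j] @ preserved[:, j])

print("many-to-one loss:", objective(pruned, preserved, s_star, lam))
print("one-to-one loss: ", objective(pruned, preserved, s_one, lam))
```

Because the objective is strictly convex for `lam > 0`, `s_star` is its unique minimizer, so its loss cannot exceed that of any single-filter coefficient vector such as `s_one`.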
Abstract
Although network pruning has become a popular way to compress deep neural networks, the resulting accuracy depends heavily on a fine-tuning process that is often computationally expensive and requires the original data. In real-world scenarios, however, the data or the compute for fine-tuning may not be available, and hence a few recent works attempt to restore pruned networks without any expensive retraining process. Their strong assumption is that every neuron being pruned can be replaced with another one quite similar to it, but unfortunately this does not hold in many neural networks, where the similarity between neurons is extremely low in some layers. In this article, we propose a more rigorous and robust method of restoring pruned networks in a fine-tuning-free and data-free manner, called LBYL (Leave Before You Leave). LBYL significantly relaxes the aforementioned assumption: each pruned neuron leaves its pieces of information to as many preserved neurons as possible, so that multiple neurons together obtain a more robust approximation to the original output of the neuron that just left. Our method is based on a theoretical analysis of how to formulate the reconstruction error between the original network and its approximation, which leads to a closed-form solution for our derived loss function. Through extensive experiments, LBYL is confirmed to approximate the original network more effectively and consequently to achieve higher accuracy for restored networks than recent approaches that exploit the similarity between two neurons. The very first version of this work, which contains major technical and theoretical components, was submitted to NeurIPS 2021 and ICML 2022.
