Compression Repair for Feedforward Neural Networks Based on Model Equivalence Evaluation
Zihao Mo, Yejiang Yang, Shuaizheng Lu, Weiming Xiang
TL;DR
The paper addresses the accuracy loss that accompanies neural network compression. It introduces an equivalence-evaluation framework that measures the output discrepancy between an original network $\Phi_1$ and a compressed network $\Phi_2$. The two networks are merged into a single network with an added comparison layer, which yields a discrepancy vector $\mathbf{y}_{\mathrm{cmp}}$ and a reachable set $\tilde{\mathcal{Y}}^{\{L+1\}}$; equivalence is then characterized through $\delta_{\max}$ and $\tilde{\delta}_{\max}$. The repair strategy avoids overfitting by iteratively updating the retraining targets with $\hat{\mathbf{y}}_{i,2}^{\{L\}} = \mathbf{y}_{i,2}^{\{L\}} + \frac{1}{\alpha} \tilde{\delta}_{\max}$ and retraining until the repaired network satisfies a prescribed target set $\mathcal{O}$. The approach is demonstrated on MNIST with a three-layer network compressed by quantization-aware training: compression lowers accuracy from $98\%$ to $91\%$, repair recovers the original $98\%$, and discrepancies are reduced across different values of $\alpha$. The result is a principled, verifiable method for repairing compressed FNNs, supporting safer and more reliable deployment of compact models in practical applications.
Abstract
In this paper, we propose a method for repairing compressed Feedforward Neural Networks (FNNs) based on equivalence evaluation of two neural networks. Within the repair framework, a novel neural network equivalence evaluation method is developed to compute the output discrepancy between two neural networks. This output discrepancy quantitatively characterizes the output difference introduced by the compression procedure. Based on the computed output discrepancy, the repair method first initializes a new training set for the compressed network, with the aim of narrowing the discrepancy between the two networks and improving the performance of the compressed network. We then repair the compressed FNN by retraining it on this training set. We apply the method to the MNIST dataset to demonstrate the effectiveness and advantages of the proposed repair approach.
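To make the two ingredients above concrete, the following is a minimal NumPy sketch of (a) a merged network whose comparison layer outputs the discrepancy $\mathbf{y}_{\mathrm{cmp}}$ and (b) the target update $\hat{\mathbf{y}}_{i,2}^{\{L\}} = \mathbf{y}_{i,2}^{\{L\}} + \frac{1}{\alpha} \tilde{\delta}_{\max}$. Several things here are assumptions and not taken from the paper: the sign convention $\mathbf{y}_{\mathrm{cmp}} = \Phi_1(\mathbf{x}) - \Phi_2(\mathbf{x})$, the tiny random ReLU networks, weight rounding as a crude stand-in for quantization-aware training, and all function names. Most importantly, the worst-case discrepancy is estimated from sampled inputs, whereas the paper obtains $\tilde{\delta}_{\max}$ from a reachable set; a sampled estimate gives no guarantee over the whole input domain.

```python
import numpy as np

def forward(weights, biases, x):
    """ReLU feedforward pass; the final layer is linear."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)
    return h @ weights[-1] + biases[-1]

def discrepancy(net1, net2, x):
    """Merged network: evaluate both nets on the same input, then apply a
    comparison layer [I; -I] to the stacked outputs, giving y1 - y2.
    (The sign convention is an assumption of this sketch.)"""
    y1 = forward(*net1, x)
    y2 = forward(*net2, x)
    n = y1.shape[1]
    compare = np.vstack([np.eye(n), -np.eye(n)])  # shape (2n, n)
    return np.hstack([y1, y2]) @ compare

def worst_case(y_cmp):
    """Per output dimension, the signed sampled discrepancy of largest magnitude.
    This is a sample-based estimate, not a reachable-set bound."""
    idx = np.argmax(np.abs(y_cmp), axis=0)
    return y_cmp[idx, np.arange(y_cmp.shape[1])]

def updated_targets(y2, delta_max, alpha):
    """Shift the compressed network's outputs by delta_max / alpha."""
    return y2 + delta_max / alpha

# Hypothetical toy setup: a 4-8-3 network and a weight-rounded copy.
rng = np.random.default_rng(0)
sizes = [4, 8, 3]
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n) for n in sizes[1:]]
original = (weights, biases)
compressed = ([np.round(W * 4) / 4 for W in weights],
              [np.round(b * 4) / 4 for b in biases])

x = rng.uniform(-1.0, 1.0, size=(256, sizes[0]))
y_cmp = discrepancy(original, compressed, x)
delta_max = worst_case(y_cmp)
y2 = forward(*compressed, x)
targets = updated_targets(y2, delta_max, alpha=2.0)
```

In this sketch a larger $\alpha$ produces a smaller shift per iteration, so the retraining targets move more gradually toward the original network's outputs. A full repair loop would retrain $\Phi_2$ on `targets`, re-evaluate the discrepancy, and stop once the outputs fall inside the prescribed target set $\mathcal{O}$; that loop is omitted here.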
