Compression Repair for Feedforward Neural Networks Based on Model Equivalence Evaluation

Zihao Mo; Yejiang Yang; Shuaizheng Lu; Weiming Xiang

TL;DR

The paper addresses the accuracy loss that accompanies neural network compression. It introduces an equivalence-evaluation framework that measures the output discrepancy between an original network $\Phi_1$ and a compressed network $\Phi_2$ by merging them into a single network with a comparison layer. The merged network yields a discrepancy vector $\mathbf{y}_{\mathrm{cmp}}$ and a reachable set $\tilde{\mathcal{Y}}^{\{L+1\}}$, from which equivalence is characterized via $\delta_{\max}$ and $\tilde{\delta}_{\max}$. The paper then formulates a repair strategy that avoids overfitting by iteratively updating the retraining targets with $\hat{\mathbf{y}}_{i,2}^{\{L\}} = \mathbf{y}_{i,2}^{\{L\}} + \frac{1}{\alpha} \tilde{\delta}_{\max}$ and retraining until the repaired network satisfies a prescribed target set $\mathcal{O}$. The approach is demonstrated on MNIST with a three-layer network compressed by quantization-aware training: compression lowers accuracy from $98\%$ to $91\%$, and repair restores it to the original $98\%$, with discrepancies reduced for every tested $\alpha$ value. The result is a principled, verifiable method for repairing compressed FNNs, supporting safer and more reliable deployment of compact models in practical applications.
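The target update $\hat{\mathbf{y}}_{i,2}^{\{L\}} = \mathbf{y}_{i,2}^{\{L\}} + \frac{1}{\alpha} \tilde{\delta}_{\max}$ can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `update_targets`, the array shapes, the requirement $\alpha > 0$, and the sample numbers are all assumptions made for illustration, and computing $\tilde{\delta}_{\max}$ itself (via the reachable set) is not shown.

```python
import numpy as np

def update_targets(y2, delta_max, alpha):
    """Return retraining targets y2 + delta_max / alpha.

    y2        : (n_samples, n_outputs) outputs of the compressed network
    delta_max : (n_outputs,) discrepancy estimate, broadcast over samples
                (hypothetical shape; the source does not specify it)
    alpha     : divisor controlling the step size; assumed positive here
    """
    if alpha <= 0:
        raise ValueError("alpha is assumed to be positive in this sketch")
    y2 = np.asarray(y2, dtype=float)
    delta_max = np.asarray(delta_max, dtype=float)
    return y2 + delta_max / alpha

# Made-up values, only to show the arithmetic of one update step.
y2 = np.array([[0.2, 0.7],
               [0.5, 0.1]])
delta = np.array([0.4, -0.2])
targets = update_targets(y2, delta, alpha=2.0)
```

With a larger $\alpha$, each update moves the targets by a smaller fraction of the discrepancy, which is consistent with the summary's description of an iterative repair that stops once the target set $\mathcal{O}$ is satisfied.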

Abstract

In this paper, we propose a method for repairing compressed Feedforward Neural Networks (FNNs) based on equivalence evaluation of two neural networks. Within the repair framework, a novel neural network equivalence evaluation method is developed to compute the output discrepancy between two neural networks. This output discrepancy quantitatively characterizes the output difference introduced by compression procedures. Based on the computed output discrepancy, the repair method first initializes a new training set for the compressed network, designed to narrow the discrepancy between the two networks and improve the performance of the compressed network. We then repair the compressed FNN by retraining it on this training set. We apply the method to the MNIST dataset to demonstrate the effectiveness and advantages of the proposed repair method.

Paper Structure (10 sections, 3 theorems, 23 equations, 4 figures, 2 tables, 1 algorithm)

Introduction
Preliminaries
Main Results
Equivalence Evaluation for Two FNNs
FNN Compression Repair
Application to Compressed Feedforward Neural Networks Repairment
Database
Experiment Set Up
Results
Conclusions

Key Result

Lemma 1

For two FNNs $\mathcal{N}_1$ and $\mathcal{N}_2$ under Assumption 1, the following result holds for fully connected layers, where $\tilde{\mathbf{W}}^{\{l\}} = \mathrm{diag}\{\mathbf{W}_{1}^{\{l\}},\mathbf{W}_{2}^{\{l\}}\}$ and $\tilde{\mathbf{b}}^{\{l\}} = [ (\mathbf{b}_{1}^{\{l\}})^{T} , (\mathbf{b}_{2}^{\{l\}})^{T} ]^{T}$, in which $\mathbf{W}_{1}^{\{l\}}$, $\mathbf{W}_{2}^{\{l\
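The merged parameters $\tilde{\mathbf{W}}^{\{l\}}$ and $\tilde{\mathbf{b}}^{\{l\}}$ can be built numerically as in the sketch below. Because the lemma statement is cut off here, the property checked is only the standard one for block-diagonal weights: a merged layer applied to the stacked inputs reproduces the stacked outputs of the two separate layers. The helper name `merge_layer`, the layer sizes, the random values, and the ReLU activation are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def merge_layer(W1, b1, W2, b2):
    """Build W_tilde = diag{W1, W2} and b_tilde = [b1^T, b2^T]^T."""
    W_tilde = np.block([
        [W1, np.zeros((W1.shape[0], W2.shape[1]))],
        [np.zeros((W2.shape[0], W1.shape[1])), W2],
    ])
    b_tilde = np.concatenate([b1, b2])
    return W_tilde, b_tilde

def relu(z):
    # Activation chosen only for this sketch.
    return np.maximum(z, 0.0)

# Hypothetical layer sizes and random parameters.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)
x1, x2 = rng.normal(size=4), rng.normal(size=5)

W_tilde, b_tilde = merge_layer(W1, b1, W2, b2)

# Merged layer on the stacked inputs of both networks.
merged_out = relu(W_tilde @ np.concatenate([x1, x2]) + b_tilde)

# The two layers evaluated separately, then stacked.
separate_out = np.concatenate([relu(W1 @ x1 + b1), relu(W2 @ x2 + b2)])
```

The zero off-diagonal blocks keep the two networks from interacting, so any elementwise activation gives the same agreement between `merged_out` and `separate_out`.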

Figures (4)

Figure 1: Framework of compressed feedforward neural network repair.
Figure 2: Repair results for a handwritten digit "9". Blue dots are the outputs of the original network $\Phi_1$. The green whisker lines show the output range of the compressed network $\Phi_2$ before repair, and the red whisker lines show the output range of the compressed network $\hat{\Phi}_2$ after repair. The repaired network produces a tighter output range (red whisker lines) that lies closer to the original outputs (blue dots).
Figure 3: Repair results for a handwritten digit "9". Each colored line shows the average discrepancy over input images between the original network $\Phi_1$ and the compressed network $\Phi_2$ for one $\alpha$ setting. Different $\alpha$ values may lead to different repair performance, but the repair process always decreases the discrepancy.
Figure 4: Accuracy on the whole test set over the course of the repair process for different $\alpha$ settings. Every $\alpha$ value brings the compressed network to $98\%$ accuracy in 3 epochs.

Theorems & Definitions (11)

Definition 1
Remark 1
Remark 2
Lemma 1
proof
Lemma 2
proof
Remark 3
Theorem 1
proof
...and 1 more
