Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi, Jayaraman J. Thiagarajan, Ruben Glatt, Shusen Liu
TL;DR
This work investigates how neural representations can parameterize pre-trained CNN weights to improve both accuracy and predictor-parameter efficiency. It reveals that a reconstruction-only objective can recover or exceed original accuracy and introduces a two-phase, decoupled training scheme (reconstruction followed by distillation) plus an inception-like progressive reconstruction to boost performance and compression. The use of high-capacity teachers during the distillation phase further enhances the compression–accuracy trade-off, enabling significant predictor-size reductions while maintaining or surpassing baseline performance across multiple datasets. These findings offer a practical pathway to deploying weight-predictor schemes that simultaneously enhance model quality and storage efficiency, with potential extensions to diverse architectures and data-free settings.
Abstract
In this work, we investigate the fundamental trade-off between accuracy and parameter efficiency when parameterizing neural network weights with predictor networks. We present a surprising finding: when recovering the original model accuracy is the sole objective, it can be achieved effectively through the weight reconstruction objective alone. Additionally, we explore the underlying factors for improving weight reconstruction under parameter-efficiency constraints, and propose a novel training scheme that decouples the reconstruction objective from auxiliary objectives such as knowledge distillation, which leads to significant improvements over state-of-the-art approaches. Finally, these results pave the way for more practical scenarios, where one needs to improve both model accuracy and predictor network parameter efficiency simultaneously.
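
The decoupling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean-squared-error form of the reconstruction loss, the temperature-softened KL form of the distillation loss, and the hard switch at a hypothetical `recon_epochs` boundary are all assumptions made for illustration. The only point it shows is that the two objectives are optimized in separate phases rather than summed into one loss.

```python
import numpy as np


def reconstruction_loss(predicted_weights, original_weights):
    """Mean squared error between predicted and original weights (assumed form)."""
    diff = np.asarray(predicted_weights, dtype=float) - np.asarray(original_weights, dtype=float)
    return float(np.mean(diff ** 2))


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, averaged over the batch."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(np.mean(kl))


def phase_loss(epoch, recon_epochs, predicted_weights, original_weights,
               student_logits, teacher_logits):
    """Return the loss for the current phase.

    Assumption: phase 1 (epoch < recon_epochs) uses reconstruction only,
    and phase 2 uses distillation only. The two terms are never mixed.
    """
    if epoch < recon_epochs:
        return reconstruction_loss(predicted_weights, original_weights)
    return distillation_loss(student_logits, teacher_logits)
```

In an actual training loop, `predicted_weights` would come from the predictor network, and `student_logits` would come from the model instantiated with those predicted weights; the teacher could be the original model or a higher-capacity one.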
