Physics Inspired Criterion for Pruning-Quantization Joint Learning
Weiying Xie, Xiaoyi Fan, Xin Zhang, Yunsong Li, Jie Lei, Leyuan Fang
TL;DR
This work tackles the challenge of deploying CNNs on resource-constrained devices by proposing PIC-PQ, a physics-inspired criterion for pruning-quantization joint learning. By drawing an analogy to elasticity dynamics, it defines a global filter importance through the relation $I_{i}^{l(i)} = a_{l(i)} \cdot FP(\boldsymbol{\Theta_{i}^{l(i)}}) + b_{l(i)}$ with $FP(\boldsymbol{\Theta_{i}^{l(i)}}) = R(\mathbf{o}_{i}^{l(i)}(D))$, enabling cross-layer ranking via a learned shift $b_{l(i)}$ and a module-internal deformation scale $a_{l(i)}$. The method derives a theoretically grounded objective and computes automatic per-layer bitwidths using layer sparsity and a hardware-aware penalty, solved efficiently with a Regularized Evolutionary Algorithm. Empirically, PIC-PQ delivers substantial BOPs reductions (e.g., 54.96× for ResNet56 on CIFAR10 with a 0.10% accuracy drop and 53.24× for ResNet18 on ImageNet with a 0.61% accuracy drop) while maintaining competitive or superior accuracy versus state-of-the-art methods. The approach thus offers a principled, hardware-conscious path to jointly prune and quantize networks with interpretable global rankings and automatic bitwidth allocation.
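To make the linear criterion concrete, the following is a minimal sketch of how a global ranking could be formed from $I_{i}^{l(i)} = a_{l(i)} \cdot FP + b_{l(i)}$. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy FP values, and the hand-picked $a$ and $b$ are all hypothetical, whereas in PIC-PQ the per-layer scale and shift are learned and FP is computed from the filter outputs on data.

```python
from typing import Dict, List, Tuple


def global_importance(
    fp: Dict[int, List[float]],
    a: Dict[int, float],
    b: Dict[int, float],
) -> List[Tuple[float, int, int]]:
    """Return (importance, layer, filter_index) for every filter.

    Applies I = a_l * FP + b_l per layer so that scores from
    different layers become comparable on one global scale.
    """
    scores = []
    for layer, values in fp.items():
        for idx, value in enumerate(values):
            scores.append((a[layer] * value + b[layer], layer, idx))
    return scores


def filters_to_prune(
    fp: Dict[int, List[float]],
    a: Dict[int, float],
    b: Dict[int, float],
    prune_ratio: float,
) -> List[Tuple[int, int]]:
    """Pick the globally least important filters as (layer, filter_index)."""
    ranked = sorted(global_importance(fp, a, b))  # ascending importance
    n_prune = int(len(ranked) * prune_ratio)
    return [(layer, idx) for _, layer, idx in ranked[:n_prune]]


# Toy inputs (hypothetical): two layers with 3 and 2 filters.
fp = {0: [0.2, 0.9, 0.5], 1: [0.1, 0.4]}
a = {0: 1.0, 1: 2.0}    # assumed deformation scales, one per layer
b = {0: 0.0, 1: 0.25}   # assumed relative shifts, one per layer

pruned = filters_to_prune(fp, a, b, prune_ratio=0.4)
print(pruned)
```

Because every filter is scored on the same scale, a single global threshold decides how many filters each layer loses, so the per-layer sparsity falls out of the ranking rather than being set by hand.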
Abstract
Pruning-quantization joint learning facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices. However, most existing methods do not jointly learn a global criterion for pruning and quantization in an interpretable way. In this paper, we propose a novel physics-inspired criterion for pruning-quantization joint learning (PIC-PQ), which builds on an analogy, first drawn here, between elasticity dynamics (ED) and model compression (MC). Specifically, following Hooke's law in ED, we establish a linear relationship between the filters' importance distribution and the filter property (FP) through a learnable deformation scale in the physics-inspired criterion (PIC). Furthermore, we extend PIC with a relative shift variable to obtain a global view. To ensure feasibility and flexibility, an available maximum bitwidth and a penalty factor are introduced into the quantization bitwidth assignment. Experiments on image classification benchmarks demonstrate that PIC-PQ yields a good trade-off between accuracy and bit-operations (BOPs) compression ratio (e.g., a 54.96X BOPs compression ratio for ResNet56 on CIFAR10 with a 0.10% accuracy drop and 53.24X for ResNet18 on ImageNet with a 0.61% accuracy drop). The code will be available at https://github.com/fanxxxxyi/PIC-PQ.
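As a rough illustration of why joint pruning and quantization can reach BOPs compression ratios of this magnitude, the sketch below computes a ratio for a single convolutional layer. It assumes the common simplified approximation that a layer's BOPs equal its multiply-accumulate count times the weight bitwidth times the activation bitwidth; the abstract does not state the exact BOPs formula used by PIC-PQ, and the function names and layer shapes here are hypothetical.

```python
def conv_bops(c_in: int, c_out: int, k: int, h_out: int, w_out: int,
              w_bits: int, a_bits: int) -> int:
    """Approximate BOPs of one conv layer: MACs * weight bits * activation bits.

    This is an assumed simplification, not necessarily the paper's definition.
    """
    macs = c_in * c_out * k * k * h_out * w_out
    return macs * w_bits * a_bits


def bops_compression_ratio(baseline_bops: int, compressed_bops: int) -> float:
    """Ratio of baseline BOPs to compressed BOPs (higher means more compression)."""
    return baseline_bops / compressed_bops


# Hypothetical layer: 16 -> 16 channels, 3x3 kernel, 32x32 output, 32-bit.
baseline = conv_bops(16, 16, 3, 32, 32, w_bits=32, a_bits=32)

# Same layer after pruning half the input and output channels
# and quantizing weights and activations to 4 bits.
compressed = conv_bops(8, 8, 3, 32, 32, w_bits=4, a_bits=4)

ratio = bops_compression_ratio(baseline, compressed)
print(ratio)  # 256.0
```

Under this approximation the two effects multiply: halving both channel counts gives 4X, and moving from 32-bit to 4-bit weights and activations gives a further 64X.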
