Physics Inspired Criterion for Pruning-Quantization Joint Learning

Weiying Xie; Xiaoyi Fan; Xin Zhang; Yunsong Li; Jie Lei; Leyuan Fang

TL;DR

This work tackles the challenge of deploying CNNs on resource-constrained devices by proposing PIC-PQ, a physics-inspired criterion for pruning-quantization joint learning. By drawing an analogy to elasticity dynamics, it defines a global filter importance through the relation $I_{i}^{l(i)} = a_{l(i)} \cdot FP(\boldsymbol{\Theta_{i}^{l(i)}}) + b_{l(i)}$ with $FP(\boldsymbol{\Theta_{i}^{l(i)}}) = R(\mathbf{o}_{i}^{l(i)}(D))$, enabling cross-layer ranking via a learned shift $b_{l(i)}$ and a module-internal deformation scale $a_{l(i)}$. The method derives a theoretically grounded objective and computes automatic per-layer bitwidths using layer sparsity and a hardware-aware penalty, solved efficiently with a Regularized Evolutionary Algorithm. Empirically, PIC-PQ delivers substantial BOPs reductions (e.g., up to ~54× on CIFAR10 with minimal accuracy loss and ~53× on ImageNet) while maintaining competitive or superior accuracy versus state-of-the-art methods, indicating strong potential for practical, interpretable model compression. The approach thus offers a principled, hardware-conscious path to jointly prune and quantize networks with interpretable global rankings and automatic bitwidth allocation.
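The ranking relation above can be illustrated with a minimal sketch. It assumes that $R(\cdot)$ is the matrix rank of a filter's output feature map, averaged over a small batch of inputs (the averaging is an assumption, not stated here), and that the per-layer pairs $a_{l}$, $b_{l}$ have already been found by the search. The function names `filter_property`, `global_importance`, and `select_filters` are hypothetical and do not come from the authors' code.

```python
import numpy as np

def filter_property(feature_maps):
    """FP of one filter: mean matrix rank of its feature maps.

    feature_maps: iterable of 2-D arrays (one H x W map per input sample).
    Averaging over samples is an assumption made for this sketch.
    """
    return float(np.mean([np.linalg.matrix_rank(fm) for fm in feature_maps]))

def global_importance(fp_per_layer, a, b):
    """Apply I = a_l * FP + b_l to every filter of every layer.

    fp_per_layer: list of 1-D arrays, one FP value per filter in that layer.
    a, b: per-layer deformation scales and relative shifts (assumed given).
    Returns a flat list of (layer_index, filter_index, importance).
    """
    scores = []
    for l, fp in enumerate(fp_per_layer):
        for i, value in enumerate(fp):
            scores.append((l, i, a[l] * float(value) + b[l]))
    return scores

def select_filters(scores, keep_ratio):
    """Keep the globally top-ranked fraction of filters across all layers."""
    ranked = sorted(scores, key=lambda s: s[2], reverse=True)
    n_keep = max(1, int(round(keep_ratio * len(ranked))))
    return {(l, i) for l, i, _ in ranked[:n_keep]}

# Toy usage: two layers with two filters each.
fp = [np.array([1.0, 3.0]), np.array([2.0, 4.0])]
kept = select_filters(global_importance(fp, a=[1.0, 0.5], b=[0.0, 2.5]), keep_ratio=0.5)
print(sorted(kept))
```

In this toy case the shift $b_{l}$ lifts both filters of the second layer above those of the first, which is the cross-layer effect that a purely per-layer ranking could not express.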

Abstract

Pruning-quantization joint learning facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices. However, most existing methods do not jointly learn a global criterion for pruning and quantization in an interpretable way. In this paper, we propose a novel physics inspired criterion for pruning-quantization joint learning (PIC-PQ), which is derived from an analogy, drawn here for the first time, between elasticity dynamics (ED) and model compression (MC). Specifically, following Hooke's law in ED, we establish a linear relationship between the filters' importance distribution and the filter property (FP) through a learnable deformation scale in the physics inspired criterion (PIC). Furthermore, we extend PIC with a relative shift variable to obtain a global view. To ensure feasibility and flexibility, an available maximum bitwidth and a penalty factor are introduced in the quantization bitwidth assignment. Experiments on image classification benchmarks demonstrate that PIC-PQ yields a good trade-off between accuracy and bit-operations (BOPs) compression ratio (e.g., a 54.96× BOPs compression ratio for ResNet56 on CIFAR10 with a 0.10% accuracy drop, and 53.24× for ResNet18 on ImageNet with a 0.61% accuracy drop). The code will be available at https://github.com/fanxxxxyi/PIC-PQ.
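To make the BOPs compression ratio concrete, the following sketch computes it for a toy two-layer network. It assumes the common simplification that a layer's BOPs equal its multiply-accumulate count times the weight bitwidth times the activation bitwidth; the exact definition used in the paper is not given in this summary, and the function names and layer numbers below are purely illustrative.

```python
def layer_bops(macs, weight_bits, act_bits):
    """Approximate bit-operations of one layer (assumed: MACs * b_w * b_a)."""
    return macs * weight_bits * act_bits

def bops_compression_ratio(baseline_layers, compressed_layers):
    """Ratio of total baseline BOPs to total compressed BOPs.

    Each layer is a (macs, weight_bits, act_bits) tuple. Pruning lowers
    the MAC count; quantization lowers the bitwidths.
    """
    baseline = sum(layer_bops(*layer) for layer in baseline_layers)
    compressed = sum(layer_bops(*layer) for layer in compressed_layers)
    return baseline / compressed

# Hypothetical numbers: a 32-bit baseline versus a pruned, mixed-precision model.
baseline = [(1000, 32, 32), (2000, 32, 32)]
compressed = [(500, 4, 4), (1000, 8, 8)]
ratio = bops_compression_ratio(baseline, compressed)
print(f"BOPs compression ratio: {ratio:.2f}x")
```

Under this definition, halving the MACs of a layer and reducing both bitwidths from 32 to 8 already shrinks that layer's BOPs by a factor of 32, which is why joint pruning and quantization can reach ratios in the tens.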

Paper Structure (23 sections, 21 equations, 6 figures, 8 tables, 1 algorithm)

Introduction
Related Work
Pruning-Quantization Joint Learning
Bridging CNNs with Knowledge of Mathematics or Physics
Method
The Analogy between ED and MC
Physics Inspired Criterion for Ranking Filters Globally
The Derivation from a Mathematical Theory Perspective
Quantization Bitwidth Automatically Assignment
Obtain the Compressed Model
Experiments
Experiment Setting
Implementation Details
Evaluation Metrics
Experiments and Comparisons
...and 8 more sections

Figures (6)

Figure 1: The analogy between the optimization in ED and MC.
Figure 2: Overview of the PIC-PQ framework. (a) Step 1 shows a visual analogy between elastomers' deformation in ED and the filters' importance distribution in MC, and establishes that the filters' importance distribution is linearly related to FP. In Step 2, a relative shift variable is introduced to rank filters globally across layers. (b) The rank of the feature map from each filter is then computed to obtain the global importance ranking, using the optimal $\boldsymbol{a}$-$\boldsymbol{b}$ pairs found by the earlier search. (c) Finally, the compression policy is assigned automatically.
Figure 3: Lipschitz bound on the function. $k$ denotes the slope.
Figure 4: Two-dimensional feature representation of the inner layers of the ResNet56 network trained on CIFAR10. (a) Conv1 (first layer before Block 1) (b) Conv19 (last layer of Block 1) (c) Conv37 (last layer of Block 2) (d) Conv55 (last layer of Block 3).
Figure 5: Influence of $\boldsymbol{\tau}$ when compressing ResNet56 on CIFAR100.
...and 1 more figures
