Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning

Manish Sharma; Jamison Heard; Eli Saber; Panos P. Markopoulos

TL;DR

This work addresses CNN deployment under resource constraints by introducing Dynamic Parameter Rank Pruning (DPRP), which compresses networks during training by learning per-layer ranks dynamically. Convolutional and dense layers are factorized with SVD, and the factors are guided by a triad of losses that enforce application performance, orthogonality, and controlled rank reduction, so the method achieves substantial parameter savings without a separate retraining stage after training. Empirical results on CIFAR-10, CIFAR-100, and ImageNet show that DPRP can maintain or improve accuracy while reducing trainable parameters and preserving computational efficiency, which makes it practical for edge environments. The approach advances model compression by learning per-layer ranks end to end, reducing manual rank selection and post-hoc pruning, and offering a scalable path toward deployment on resource-constrained devices.
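To make the orthogonality and rank-reduction ideas above concrete, here is a minimal pure-Python sketch. It is not the paper's loss formulation: the function names, the squared-Frobenius form of the penalty, and the relative threshold rule for dropping singular values are all illustrative assumptions.

```python
# Hypothetical helpers: names, penalty form, and threshold rule are
# illustrative assumptions, not the definitions used in the paper.

def orthogonality_penalty(U):
    """Squared Frobenius norm of (U^T U - I) for a matrix given as a list of rows."""
    rows, cols = len(U), len(U[0])
    penalty = 0.0
    for i in range(cols):
        for j in range(cols):
            # Entry (i, j) of the Gram matrix U^T U.
            gram_ij = sum(U[k][i] * U[k][j] for k in range(rows))
            target = 1.0 if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty


def prune_rank(singular_values, rel_threshold=0.05):
    """Indices of singular values whose magnitude is at least
    rel_threshold times the largest magnitude."""
    largest = max(abs(s) for s in singular_values)
    return [i for i, s in enumerate(singular_values)
            if abs(s) >= rel_threshold * largest]


orthonormal = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # columns are orthonormal
skewed = [[1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]       # columns are not orthogonal

penalty_ok = orthogonality_penalty(orthonormal)
penalty_bad = orthogonality_penalty(skewed)
kept = prune_rank([3.0, 1.2, 0.1, 0.01])
```

In this sketch the penalty is zero only when the factor has orthonormal columns, and pruning shrinks the rank by discarding the indices that fall below the threshold. A trainable version would express the same quantities in an automatic-differentiation framework.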

Abstract

While Convolutional Neural Networks (CNNs) excel at learning complex latent-space representations, their over-parameterization can lead to overfitting and reduced performance, particularly with limited data. This, alongside their high computational and memory demands, limits the applicability of CNNs for edge deployment. Low-rank matrix approximation has emerged as a promising approach to reduce CNN parameters, but its application presents challenges, including rank selection and performance loss. To address these issues, we propose an efficient training method for CNN compression via dynamic parameter rank pruning. Our approach integrates efficient matrix factorization and novel regularization techniques, forming a robust framework for dynamic rank reduction and model compression. We use Singular Value Decomposition (SVD) to model low-rank convolutional filters and dense weight matrices, and we achieve model compression by training the SVD factors end to end with back-propagation. We evaluate our method on an array of modern CNNs, including ResNet-18, ResNet-20, and ResNet-32, and on the CIFAR-10, CIFAR-100, and ImageNet (2012) datasets, showcasing its applicability in computer vision. Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
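The storage argument can be illustrated with simple parameter counting. The sketch below assumes a rank-r SVD-style factorization that stores U (m x r), r singular values, and V (n x r), and it assumes a convolutional kernel is reshaped to a c_out x (c_in * k * k) matrix before factorization. Both conventions and all function names are assumptions for illustration; the paper's exact layout may differ.

```python
# Hypothetical parameter-counting helpers; the storage layout and the
# kernel reshaping convention are assumptions, not taken from the paper.

def dense_params(m, n):
    """Parameters in an unfactorized m x n weight matrix."""
    return m * n


def factored_params(m, n, r):
    """Parameters in rank-r factors: U (m x r), r singular values, V (n x r)."""
    return r * (m + n + 1)


def conv_as_matrix_shape(c_out, c_in, k):
    """Assumed reshaping of a c_out x c_in x k x k kernel into a matrix."""
    return c_out, c_in * k * k


def break_even_rank(m, n):
    """Largest rank for which the factored form is strictly smaller than dense."""
    return (m * n - 1) // (m + n + 1)


# Example: a 3x3 convolution with 64 input and 64 output channels.
m, n = conv_as_matrix_shape(64, 64, 3)
full = dense_params(m, n)
low_rank = factored_params(m, n, 16)
limit = break_even_rank(m, n)
```

Under these assumptions the factored form only saves storage when the rank is below the break-even value, which is smaller than the full rank of the reshaped matrix. This is one way to see why the rank has to be reduced, not merely factorized, to obtain compression.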

Paper Structure (23 sections, 17 equations, 6 figures, 5 tables)

Introduction
Related Work
Proposed Method
Notation and SVD Preliminaries
Factorized Convolutional and Fully-Connected Layer
Convolutional Layer
Fully-Connected Layer
Factor Initialization and Training
Application Loss
Structure Loss
Compression Loss
Model Compression
Experimentation
Datasets, Baseline Models, and Evaluation Metrics
Experimental Configuration
...and 8 more sections

Figures (6)

Figure 1: A typical convolutional layer.
Figure 2: A typical fully-connected layer.
Figure 3: The variation in factor sizes, represented by the initial rank $\theta$ and the final rank $\phi$.
Figure 4: The variation in the number of trainable parameters and FLOPs across epochs for the proposed ResNet-20 model on the CIFAR-10 dataset.
Figure 5: Initial and final rank comparison for the proposed ResNet-20 model on the CIFAR-10 dataset. A smaller rank indicates a more compact layer with fewer trainable parameters.
...and 1 more figure
