Deep Convolutional Neural Networks Structured Pruning via Gravity Regularization

Abdesselam Ferdi

TL;DR

This work introduces gravity-based regularization for structured pruning of DCNNs. A gravity penalty added during training redistributes convolutional filter weights around an attracting filter, based on each filter's mass $m_{n,l} = \|W_{n,l}\|_1$ and its distance $d_{n,l}$ from the attracting filter. The gravity term $F_{n,l} = G\frac{m_{1,l}m_{n,l}}{d_{n,l}^2}$, where $m_{1,l}$ is the mass of the attracting filter, is added to the cost with coefficient $\alpha_g$. It biases filters near the attracting filter to retain nonzero weights while distant filters are driven toward zero, so filters can be pruned after training without architectural changes. Experiments on CIFAR with ResNet-56 and VGG-19 show competitive pruning performance, at the cost of higher training overhead from the gravity penalty; fine-tuned results indicate robust post-pruning accuracy at various pruning ratios. The method offers a practical, adaptive way to accelerate DCNNs by reducing FLOPs and parameters while preserving essential information, with potential for broader applicability beyond CIFAR benchmarks.

Abstract

Structured pruning is a widely employed strategy for accelerating deep convolutional neural networks (DCNNs). However, existing methods often necessitate modifications to the original architectures, involve complex implementations, and require lengthy fine-tuning stages. To address these challenges, we propose a novel physics-inspired approach that integrates the concept of gravity into the training stage of DCNNs. In this approach, the gravity is directly proportional to the product of the masses of the convolution filter and the attracting filter, and inversely proportional to the square of the distance between them. We apply this force to the convolution filters: filters closer to the attracting filter experience weaker gravity and are drawn toward non-zero weights, while filters farther away are subject to stronger gravity and are pulled toward zero weights. As a result, filters experiencing stronger gravity have their weights reduced to zero, enabling their removal, while filters under weaker gravity retain significant weights and preserve important information. Our method simultaneously optimizes the filter weights and ranks their importance, eliminating the need for complex implementations or extensive fine-tuning. We validate the proposed approach on popular DCNN architectures using the CIFAR dataset, achieving competitive results compared to existing methods.
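As a rough illustration of the gravity relation described above (product of the two masses divided by the squared distance), the following minimal sketch evaluates it for a single layer. It is not the authors' implementation. Several details are assumptions made for the example: filters are given as flat lists of weights, the first filter is treated as the attracting filter, the distances are passed in by the caller because this summary does not say how they are measured, and the per-filter terms are simply summed before being scaled by the coefficient. The names `filter_mass`, `gravity_terms`, and `gravity_penalty` are hypothetical.

```python
from typing import List, Sequence


def filter_mass(weights: Sequence[float]) -> float:
    """Mass of a filter, taken as the L1 norm of its flattened weights."""
    return sum(abs(w) for w in weights)


def gravity_terms(
    filters: Sequence[Sequence[float]],
    distances: Sequence[float],
    G: float = 1.0,
) -> List[float]:
    """Evaluate G * m_attract * m_n / d_n**2 for each non-attracting filter.

    Assumptions: filters[0] is the attracting filter, and distances[n] is the
    distance of filter n from it (distances[0] is ignored). How the distance
    is defined is not specified here, so it is supplied by the caller.
    """
    m_attract = filter_mass(filters[0])
    terms = []
    for weights, d in zip(filters[1:], distances[1:]):
        if d <= 0:
            raise ValueError("distance to the attracting filter must be positive")
        terms.append(G * m_attract * filter_mass(weights) / d ** 2)
    return terms


def gravity_penalty(
    filters: Sequence[Sequence[float]],
    distances: Sequence[float],
    alpha_g: float,
    G: float = 1.0,
) -> float:
    """Scaled penalty for one layer: alpha_g times the sum of the terms.

    Summing over filters is an assumption of this sketch.
    """
    return alpha_g * sum(gravity_terms(filters, distances, G))


# Toy layer with three filters of two weights each (made-up values).
filters = [[1.0, -1.0], [0.5, 0.5], [2.0, -2.0]]
distances = [0.0, 1.0, 2.0]
print(gravity_penalty(filters, distances, alpha_g=0.5))  # prints 2.0
```

In a training loop, a value like this would be added to the task loss. The sketch only evaluates the stated formula on fixed numbers; it does not compute gradients or reproduce the pruning behavior reported in the paper.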

Paper Structure

This paper contains 18 sections, 10 equations, 2 figures, 4 tables, 2 algorithms.

Figures (2)

  • Figure 1: An illustration of gravity-based training: A convolutional layer comprising ten filters is presented as an example. The color shading indicates the mass of each filter, with the attracting filter assigned the largest mass. Initially, the convolutional layer contains filters initialized with either random or pretrained weights. Filters located farther from the attracting filter experience stronger gravitational forces, driving their weights toward zero. In contrast, filters in closer proximity encounter weaker forces, thereby pulling their weights toward non-zero values.
  • Figure 2: Top-1 accuracy of pruned ResNet-56 and VGG-19 models, initialized with pretrained weights and trained with gravity at five distinct gravity rates, without any fine-tuning.