Optimizing Convolutional Neural Network Architecture

Luis Balderas; Miguel Lastra; José M. Benítez

TL;DR

The paper tackles the challenge of deploying CNNs on resource-limited devices by introducing OCNNA, a pruning-and-knowledge-distillation framework that ranks convolutional filters by importance. It uses a three-stage, PCA-based pipeline ($P_m^{l}=\mathrm{PCA}(o_m^{l})$, $F_m^{l}=\lVert P_m^{l}\rVert_{F}$, $C_m=\mathrm{CV}(F_m^{l})$) to quantify per-filter significance, and selects the $k$-th percentile of filters, ranked by importance, to form a streamlined model that receives knowledge transferred from the original network. Experiments on CIFAR-10/100 and ImageNet with VGG-16, ResNet-50, DenseNet-40, and MobileNet show that OCNNA preserves accuracy while substantially reducing parameters (for ResNet-50 on ImageNet with $k=40$, $37.44\%$ of the parameters remain at an accuracy drop of $0.57\%$), outperforming or remaining competitive with more than 20 state-of-the-art pruning methods. The approach supports end-to-end compression with minimal tuning and is well suited to IoT and edge deployments where energy and memory are critical constraints.
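The three stages named above (PCA, Frobenius norm, coefficient of variation) can be sketched for a single filter as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: it assumes each filter output is a 2-D feature map per validation image, that PCA keeps enough components to explain 95% of the variance, and that the coefficient of variation is taken across validation images. The names `pca_project`, `filter_importance` and `var_kept` are hypothetical.

```python
import numpy as np

def pca_project(feature_map, var_kept=0.95):
    """Project a 2-D feature map onto its leading principal components.

    Assumption: rows are treated as samples, columns as features, and
    enough components are kept to explain `var_kept` of the variance.
    """
    centered = feature_map - feature_map.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    energy = s ** 2
    total = energy.sum()
    if total == 0.0:
        # A constant feature map carries no variance to project.
        return np.zeros((feature_map.shape[0], 1))
    ratio = np.cumsum(energy) / total
    n_comp = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
    # Scores of the samples on the retained components (stage 1: P = PCA(o)).
    return u[:, :n_comp] * s[:n_comp]

def filter_importance(outputs, var_kept=0.95):
    """Importance of one filter from its outputs, shape (n_images, H, W)."""
    # Stage 2: F = Frobenius norm of each projected output.
    norms = np.array(
        [np.linalg.norm(pca_project(o, var_kept), ord="fro") for o in outputs]
    )
    # Stage 3: C = coefficient of variation (std / mean) across images.
    mean = norms.mean()
    return 0.0 if mean == 0.0 else float(norms.std() / mean)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for one filter's outputs on 16 validation images.
    demo_outputs = rng.normal(size=(16, 8, 8))
    print(filter_importance(demo_outputs))
```

Under these assumptions, a filter whose response is identical on every image scores zero, while a filter whose response varies strongly across images scores higher.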

Abstract

Convolutional Neural Networks (CNNs) are widely used to tackle challenging tasks such as speech recognition, natural language processing and computer vision. As CNN architectures get larger and more complex, their computational requirements increase, incurring significant energy costs and hindering their deployment on resource-restricted devices. In this paper, we propose Optimizing Convolutional Neural Network Architecture (OCNNA), a novel CNN optimization and construction method based on pruning and knowledge distillation, designed to establish the importance of convolutional layers. The proposal has been evaluated through a thorough empirical study covering the best-known datasets (CIFAR-10, CIFAR-100 and ImageNet) and CNN architectures (VGG-16, ResNet-50, DenseNet-40 and MobileNet), with Accuracy Drop and Remaining Parameters Ratio as objective metrics for comparing the performance of OCNNA against other state-of-the-art approaches. Our method has been compared with more than 20 convolutional neural network simplification algorithms, obtaining outstanding results. As a result, OCNNA is a competitive CNN construction method that could ease the deployment of neural networks on IoT or other resource-limited devices.
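The two metrics named in the abstract can be illustrated with a short sketch. The definitions below are the usual ones and are an assumption here, since this summary does not spell them out: Accuracy Drop as the original accuracy minus the pruned accuracy (in percentage points), and Remaining Parameters Ratio as pruned parameter count over original parameter count. The function names and the numbers in the usage lines are hypothetical.

```python
def accuracy_drop(base_acc, pruned_acc):
    """Accuracy lost by pruning, in percentage points (assumed definition)."""
    return base_acc - pruned_acc

def remaining_parameters_ratio(pruned_params, original_params):
    """Fraction of the original parameters kept after pruning (assumed definition)."""
    if original_params <= 0:
        raise ValueError("original_params must be positive")
    return pruned_params / original_params

if __name__ == "__main__":
    # Hypothetical values, chosen only to show the calls.
    print(accuracy_drop(base_acc=93.0, pruned_acc=92.5))
    print(remaining_parameters_ratio(pruned_params=2_000_000,
                                     original_params=10_000_000))
```

A lower Accuracy Drop and a lower Remaining Parameters Ratio are both better, so methods are compared on the trade-off between the two.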

Paper Structure (20 sections, 8 equations, 3 figures, 9 tables, 1 algorithm)

Introduction
Previous work
Neuroevolution
Neural Architecture Search
Quantization
Knowledge Distillation
Pruning
Proposal
Notation
OCNNA: the algorithm
Implementation
Empirical Evaluation
Common architectures and datasets
Metrics
Compared state-of-art approaches
...and 5 more sections

Figures (3)

Figure 1: OCNNA applied to VGG-16. Given the output of the $i$-th convolutional layer evaluated on a validation set, PCA, the Frobenius norm and the Coefficient of Variation are applied in order to identify the most significant filters. The $k$-th percentile of filters, in terms of importance, is selected, generating a new model whose $i$-th convolutional layer is an optimized version of the original one. This approach is applied to every convolutional filter.
Figure 2: Application of OCNNA to one convolutional filter. As we can see, the filter generates partial information compared to the whole layer output. Applying OCNNA to the filter's output, our method provides a single number for this filter which reflects its importance. This process is iterated over all filters from a layer and the $k$-th percentile of them in terms of significance will form part of the new model.
Figure 3: Sensitivity study of the percentile $k$ of the significance value for ResNet-50 on the ImageNet dataset. The left Y-axis shows the test accuracy and the right Y-axis shows the remaining parameters ratio. The base accuracy is $75.4\%$. With $k=40$ (40th percentile), the number of parameters is reduced significantly ($37.44\%$ remaining) with an accuracy drop of $0.57\%$.
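The percentile-based selection described in the captions of Figures 1 to 3 can be sketched as below. This is a hedged illustration, not the paper's code: it assumes that filters whose importance is at or above the $k$-th percentile of their layer are the ones kept, and that convolution kernels are stored as `(height, width, in_channels, out_channels)`. The names `select_filters` and `prune_conv_kernel` are hypothetical.

```python
import numpy as np

def select_filters(scores, k):
    """Indices of filters whose importance is >= the k-th percentile.

    Assumption: higher scores mean more important filters, and the
    filters at or above the percentile threshold are retained.
    """
    scores = np.asarray(scores, dtype=float)
    threshold = np.percentile(scores, k)
    return np.flatnonzero(scores >= threshold)

def prune_conv_kernel(kernel, keep):
    """Keep only the selected output channels of a convolution kernel.

    Assumption: kernel layout is (height, width, in_channels, out_channels).
    """
    return kernel[..., keep]

if __name__ == "__main__":
    # Hypothetical importance scores for a layer with five filters.
    scores = [0.1, 0.5, 0.3, 0.9, 0.7]
    keep = select_filters(scores, k=40)
    kernel = np.zeros((3, 3, 4, 5))
    print(keep, prune_conv_kernel(kernel, keep).shape)
```

In this toy layer, $k=40$ keeps three of the five filters. In a full network, removing a filter also removes the matching input channel of the next layer, which this sketch does not handle.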
