Bonsai: A Framework for Convolutional Neural Network Acceleration Using Criterion-Based Pruning
Joseph Bingham, Sam Helmich
TL;DR
Convolutional neural networks have grown in size, driving high compute and power demands. The paper presents Combine, a criterion-based pruning framework that standardizes how pruning criteria are defined and applied, enabling fast, iterative pruning and easy comparison across methods. It demonstrates that different criterion functions affect models differently and introduces novel criteria, achieving up to 79% filter pruning with up to 68% FLOP reduction while preserving or improving accuracy on MNIST and CIFAR-10. This work highlights how pruning can be integrated into model design, offering substantial practical speedups, especially for edge deployments, without requiring new hardware. Overall, Combine provides a practical, extensible path to efficient CNNs through criterion-driven pruning.
Abstract
As the need for more accurate and powerful Convolutional Neural Networks (CNNs) increases, so too do their size, execution time, memory footprint, and power consumption. To overcome this, solutions such as pruning have been proposed, each with its own metrics and methodologies, or criteria, for how weights should be removed. These solutions do not share a common implementation and are difficult to implement and compare. In this work, we introduce Combine, a criterion-based pruning solution, and demonstrate that it is a fast and effective framework for iterative pruning. We also demonstrate that criteria have differing effects on different models, create a standard language for comparing criterion functions, and propose a few novel criterion functions. We show the capacity of these criterion functions and the framework on VGG-inspired models, pruning up to 79% of filters while retaining or improving accuracy, and reducing the computations needed by the network by up to 68%.
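
To make the idea of a criterion function concrete, the following is a minimal sketch of one criterion-based pruning pass. It is not the framework's actual API: the names `l1_criterion` and `select_filters_to_keep`, the flattened-list representation of a filter, and the choice of the L1 norm as the scoring rule are all assumptions made for illustration. The point it shows is that the criterion is a pluggable function that scores each filter, and the pruning step only depends on those scores.

```python
from typing import Callable, List, Sequence

# Assumption: a filter is represented as a flat sequence of its weights.
Filter = Sequence[float]


def l1_criterion(weights: Filter) -> float:
    """Illustrative criterion: sum of absolute weights (larger = more important)."""
    return sum(abs(w) for w in weights)


def select_filters_to_keep(
    filters: Sequence[Filter],
    criterion: Callable[[Filter], float],
    prune_fraction: float,
) -> List[int]:
    """Return the sorted indices of the filters that survive one pruning pass."""
    if not 0.0 <= prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in [0, 1)")
    n_prune = int(len(filters) * prune_fraction)
    # Rank filter indices from lowest to highest criterion score.
    ranked = sorted(range(len(filters)), key=lambda i: criterion(filters[i]))
    # Drop the n_prune lowest-scoring filters and keep the rest.
    return sorted(ranked[n_prune:])


# Hypothetical layer with four flattened filters.
layer = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.4, -0.6],
    [0.0, 0.03, 0.0],
]
kept = select_filters_to_keep(layer, l1_criterion, prune_fraction=0.5)
print(kept)
```

Swapping in a different scoring rule only requires passing another function as `criterion`, which is what makes side-by-side comparison of criteria straightforward. In an iterative setting, a pass like this would typically alternate with fine-tuning of the remaining filters.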
