Data-Free Dynamic Compression of CNNs for Tractable Efficiency
Lukas Meiner, Jens Mehnert, Alexandru Paul Condurache
TL;DR
The paper addresses the high computational cost of CNNs by proposing HASTE, a data-free, plug-and-play convolution module that dynamically compresses the channel dimension at test time using locality-sensitive hashing (LSH) with sparse random projections. HASTE groups similar latent channels and, because convolution is distributive, merges the redundant input channels together with their corresponding filter channels, which reduces FLOPs substantially without any training or data access. The main hyperparameter is the number of hash hyperplanes $L$, which trades accuracy against efficiency. On CIFAR-10, a ResNet34 reaches a 46.72% FLOPs reduction with a 1.25% accuracy loss; on ImageNet, the reduction is up to 31.54% for WideResNet101, and the approach scales to deeper and wider models. Because it needs neither retraining nor data, this form of dynamic pruning is relevant for edge devices and federated settings, where model complexity can be adjusted at inference time.
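The merging step relies on the linearity of convolution, which can be sketched as a short derivation. The notation is an assumption made for illustration and is not taken from the paper: $X_c$ is the $c$-th input channel, $W_c$ the matching channel of one filter, $G_g$ the set of channels that fall into hash bucket $g$, and $\bar{X}_g$ their mean.

```latex
Y \;=\; \sum_{c=1}^{C} W_c * X_c
  \;=\; \sum_{g} \sum_{c \in G_g} W_c * X_c
  \;\approx\; \sum_{g} \Bigl( \sum_{c \in G_g} W_c \Bigr) * \bar{X}_g ,
\qquad
\bar{X}_g \;=\; \frac{1}{|G_g|} \sum_{c \in G_g} X_c .
```

The approximation holds to the extent that $X_c \approx \bar{X}_g$ for every $c \in G_g$. The convolution then runs over one merged channel per bucket instead of over all $C$ channels, which is where the FLOPs saving would come from.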
Abstract
To reduce the computational cost of convolutional neural networks (CNNs) on resource-constrained devices, structured pruning approaches have shown promise in lowering floating-point operations (FLOPs) without substantial drops in accuracy. However, most methods require fine-tuning or specific training procedures to achieve a reasonable trade-off between retained accuracy and reduction in FLOPs, adding computational overhead and requiring training data to be available. To address these limitations, we propose HASTE (Hashing for Tractable Efficiency), a data-free, plug-and-play convolution module that instantly reduces a network's test-time inference cost without training or fine-tuning. Our approach utilizes locality-sensitive hashing (LSH) to detect redundancies in the channel dimension of latent feature maps, compressing similar channels to reduce input and filter depth simultaneously, resulting in cheaper convolutions. We demonstrate our approach on the popular vision benchmarks CIFAR-10 and ImageNet. On CIFAR-10, swapping the convolution modules in a ResNet34 for our HASTE module yields a 46.72% reduction in FLOPs with only a 1.25% loss in accuracy.
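As a rough illustration of how LSH can detect redundant channels, the sketch below hashes flattened channels with sparse random hyperplanes and averages the channels that share a hash code. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names, the `density` parameter, the centering step, and hashing whole flattened channels are all choices made for this example.

```python
import random
from collections import defaultdict


def make_sparse_hyperplanes(dim, num_planes, rng, density=1 / 3):
    """Draw hyperplane normals with entries in {-1, 0, +1}.

    `density` (hypothetical parameter) is the chance that an entry is nonzero.
    """
    return [
        [rng.choice((-1.0, 1.0)) if rng.random() < density else 0.0
         for _ in range(dim)]
        for _ in range(num_planes)
    ]


def hash_channel(channel, planes):
    """Return one sign bit per hyperplane for a flattened channel."""
    mean = sum(channel) / len(channel)
    centered = [v - mean for v in channel]  # centering is an assumption here
    return tuple(
        int(sum(p * c for p, c in zip(plane, centered)) > 0)
        for plane in planes
    )


def group_channels(channels, planes):
    """Bucket channel indices by hash code; equal codes suggest similar channels."""
    buckets = defaultdict(list)
    for idx, channel in enumerate(channels):
        buckets[hash_channel(channel, planes)].append(idx)
    return list(buckets.values())


def merge_channels(channels, groups):
    """Replace each bucket of channels by its element-wise mean."""
    length = len(channels[0])
    return [
        [sum(channels[i][k] for i in group) / len(group) for k in range(length)]
        for group in groups
    ]


rng = random.Random(0)
channels = [
    [1.0, 2.0, 3.0, 4.0],   # channel 0
    [1.0, 2.0, 3.0, 4.0],   # channel 1: exact duplicate of channel 0
    [4.0, -1.0, 0.5, 2.0],  # channel 2: unrelated
]
planes = make_sparse_hyperplanes(dim=4, num_planes=8, rng=rng)
groups = group_channels(channels, planes)
merged = merge_channels(channels, groups)
print(groups, len(merged))
```

Exact duplicates always receive the same code, so channels 0 and 1 end up in one bucket. In this sketch `num_planes` plays the role of the number of hyperplanes: more hyperplanes give longer codes, so fewer channels collide and less is merged.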
