REPrune: Channel Pruning via Kernel Representative Selection
Mincheol Park, Dongjin Kim, Cheonjun Park, Yuna Park, Gyeong Eun Gong, Won Woo Ro, Suhyun Kim
TL;DR
REPrune addresses the coarse granularity of channel pruning by analyzing kernels at a finer level, within each input channel. It uses agglomerative clustering with Ward linkage to identify representative kernels within each input channel, then greedily solves a maximum cluster coverage problem to select filters that cover these representatives. This enables immediate acceleration within a concurrent training-pruning pipeline. The method retains accuracy well at high FLOPs reductions across image recognition and object detection benchmarks, outperforms several channel-, clustering-, and kernel-pruning baselines, and improves training-time efficiency. It offers a practical path to deploying highly pruned CNNs on general-purpose hardware without a separate fine-tuning stage, potentially accelerating real-world computer vision workloads on both data-center GPUs and edge devices.
Abstract
Channel pruning is a widely accepted way to accelerate modern convolutional neural networks (CNNs). The resulting pruned model can be deployed immediately on general-purpose software and hardware resources. However, its large pruning granularity, specifically the unit of a convolution filter, often leads to undesirable accuracy drops because it limits the flexibility to decide how and where to introduce sparsity into the CNN. In this paper, we propose REPrune, a novel channel pruning technique that emulates kernel pruning, fully exploiting its finer but still structured granularity. REPrune identifies similar kernels within each channel using agglomerative clustering. It then selects the filters that maximize the incorporation of kernel representatives by optimizing the maximum cluster coverage problem. Integrated with a simultaneous training-pruning paradigm, REPrune prunes efficiently and progressively throughout CNN training, avoiding the conventional train-prune-finetune sequence. Experimental results show that REPrune outperforms existing methods on computer vision tasks, effectively balancing acceleration ratio and performance retention.
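The two steps described above, clustering kernels within each input channel and then choosing filters that cover as many clusters as possible, can be illustrated with a small sketch. This is not the authors' implementation: the function names (ward_cluster, select_filters), the weights[f][c] layout, and the parameters n_clusters and n_keep are hypothetical, and the clustering is a naive pure-Python version of Ward linkage written for readability. As a simplifying assumption, a filter is treated as covering a cluster whenever one of its kernels falls into that cluster; the paper's exact representative selection and coverage scoring may differ.

```python
from itertools import combinations


def ward_cluster(points, n_clusters):
    """Naive agglomerative clustering with Ward linkage.

    points: list of equal-length vectors. Returns one cluster label per point.
    """
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[m][d] for m in members) / len(members)
                for d in range(dim)]

    def ward_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest Ward cost.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: ward_cost(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def select_filters(weights, n_clusters, n_keep):
    """Greedy maximum-coverage selection of filters (illustrative only).

    weights[f][c] is the flattened kernel of filter f for input channel c.
    Assumption: a filter "covers" the cluster each of its kernels belongs to.
    """
    n_filters, n_channels = len(weights), len(weights[0])
    covers = [set() for _ in range(n_filters)]
    for c in range(n_channels):
        # Cluster the kernels that share input channel c.
        labels = ward_cluster([weights[f][c] for f in range(n_filters)],
                              n_clusters)
        for f, label in enumerate(labels):
            covers[f].add((c, label))

    selected, covered = [], set()
    while len(selected) < n_keep:
        remaining = [f for f in range(n_filters) if f not in selected]
        # Pick the filter that adds the most not-yet-covered clusters.
        best = max(remaining, key=lambda f: len(covers[f] - covered))
        selected.append(best)
        covered |= covers[best]
    return sorted(selected)


# Toy layer: 4 filters, 2 input channels, kernels flattened to 2 values.
weights = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.1, 0.0], [5.0, 5.0]],
    [[5.0, 5.0], [0.1, 0.0]],
    [[5.1, 5.0], [5.1, 5.0]],
]
kept = select_filters(weights, n_clusters=2, n_keep=2)
print(kept)  # [0, 3]
```

In this toy layer, filters 0 and 3 together contain a kernel from every cluster in both input channels, so keeping only those two still covers all four (channel, cluster) pairs. In practice a library routine such as a Ward-linkage hierarchical clustering implementation would replace the naive loop, which recomputes centroids at every step.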
