Reduced storage direct tensor ring decomposition for convolutional neural networks compression
Mateusz Gabor, Rafał Zdunek
TL;DR
This work tackles the growing memory and compute demands of CNNs by introducing Reduced storage direct Tensor Ring decomposition (RSDTR), a TR-based compression scheme that reduces storage by selecting, at a fixed accuracy, the best circular permutation of the kernel weight tensor. By representing each convolutional kernel as a 4-core TR chain and restructuring the convolution as a four-sublayer pipeline, RSDTR reduces both parameters and FLOPS while allowing the network to be fine-tuned from the decomposed factors. The method outperforms tensorized TR variants and many pruning-based approaches on CIFAR-10 and ImageNet across multiple architectures, with particularly favorable accuracy retention at high compression. Practical impact includes enabling efficient CNN deployment on edge devices and informing future work on higher-order CNNs and hybrid compression strategies that combine tensor networks with pruning.
Abstract
Convolutional neural networks (CNNs) are among the most widely used machine learning models for computer vision tasks, such as image classification. To improve the efficiency of CNNs, many CNN compression approaches have been developed. Low-rank methods approximate the original convolutional kernel with a sequence of smaller convolutional kernels, which reduces both storage and time complexity. In this study, we propose a novel low-rank CNN compression method based on reduced storage direct tensor ring decomposition (RSDTR). The proposed method offers greater flexibility in circular mode permutation and achieves large parameter and FLOPS compression rates while preserving good classification accuracy of the compressed network. Experiments on the CIFAR-10 and ImageNet datasets clearly demonstrate the efficiency of RSDTR in comparison to other state-of-the-art CNN compression approaches.
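To make the tensor ring (TR) format and the role of circular mode permutations more concrete, the following minimal NumPy sketch builds a 4-core TR chain, reconstructs the full 4-way tensor, and checks the standard TR property that circularly shifting the cores yields the same tensor with circularly shifted modes. The kernel dimensions, the TR ranks, and the helper names `tr_reconstruct` and `tr_num_params` are hypothetical choices for illustration, not values or code from the paper. The sketch does not implement the RSDTR permutation search itself, in which the ranks would come from decomposing a trained kernel at a fixed accuracy.

```python
import numpy as np

def tr_reconstruct(cores):
    """Contract four TR cores G_n of shape (R_n, I_n, R_{n+1}) into the full
    4-way tensor; the last rank index wraps around to the first (the 'ring')."""
    g1, g2, g3, g4 = cores
    return np.einsum("aib,bjc,ckd,dla->ijkl", g1, g2, g3, g4)

def tr_num_params(cores):
    """Storage cost of the TR format: the total number of core entries."""
    return sum(g.size for g in cores)

rng = np.random.default_rng(0)

# Hypothetical sizes (assumptions for illustration only):
dims = (16, 8, 3, 3)   # e.g. output channels, input channels, kernel height, kernel width
ranks = (4, 5, 2, 3)   # TR ranks R_1..R_4, with R_5 = R_1 closing the ring

# Random cores stand in for the factors a TR decomposition would produce.
cores = [
    rng.standard_normal((ranks[n], dims[n], ranks[(n + 1) % 4]))
    for n in range(4)
]

full = tr_reconstruct(cores)

# Circularly shifting the cores gives the same tensor with its modes shifted.
shifted = tr_reconstruct(cores[1:] + cores[:1])
same_up_to_mode_shift = np.allclose(shifted, np.transpose(full, (1, 2, 3, 0)))

dense_params = full.size          # entries in the uncompressed kernel
tr_params = tr_num_params(cores)  # entries stored in the TR format

print(same_up_to_mode_shift, dense_params, tr_params)
```

Because every circular shift represents the same kernel, a decomposition algorithm is free to start from any of the shifted mode orderings. In practice the ranks found at a given accuracy can differ between orderings, which is why the storage cost can depend on the permutation chosen.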
