Reduced storage direct tensor ring decomposition for convolutional neural networks compression

Mateusz Gabor, Rafał Zdunek

TL;DR

This work tackles the growing memory and compute demands of CNNs by introducing reduced storage direct tensor ring decomposition (RSDTR), a TR-based compression scheme that minimizes storage by selecting, at a fixed accuracy, the most compact circular mode permutation of the kernel weight tensor. By representing each convolutional kernel in a four-core TR format and restructuring the convolution as a four-sublayer pipeline, RSDTR reduces both parameters and FLOPS while allowing the compressed network to be fine-tuned from the decomposed factors. The method outperforms tensorized TR variants and many pruning-based approaches on CIFAR-10 and ImageNet across multiple architectures, with particularly favorable accuracy retention at high compression. Its practical impact includes enabling efficient CNN deployment on edge devices and informing future work on higher-order CNNs and hybrid compression strategies that combine tensor networks with pruning.

Abstract

Convolutional neural networks (CNNs) are among the most widely used machine learning models for computer vision tasks, such as image classification. To improve the efficiency of CNNs, many CNN compression approaches have been developed. Low-rank methods approximate the original convolutional kernel with a sequence of smaller convolutional kernels, which reduces storage and time complexity. In this study, we propose a novel low-rank CNN compression method based on reduced storage direct tensor ring decomposition (RSDTR). The proposed method offers greater flexibility in circular mode permutation and achieves large parameter and FLOPS compression rates while preserving good classification accuracy of the compressed network. Experiments on the CIFAR-10 and ImageNet datasets demonstrate the efficiency of RSDTR in comparison with other state-of-the-art CNN compression approaches.
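The storage argument can be made concrete with a small parameter count. The sketch below assumes the standard TR storage formula, in which core $n$ has shape $R_n \times I_n \times R_{n+1}$ with $R_{N+1} = R_1$. The layer shape, the rank tuples, and the function names are hypothetical illustrations, not values or code from the paper.

```python
from math import prod

def tr_num_params(dims, ranks):
    """Entries stored by TR cores of shape (R_n, I_n, R_{n+1}), with R_{N+1} = R_1."""
    n = len(dims)
    assert len(ranks) == n
    return sum(ranks[k] * dims[k] * ranks[(k + 1) % n] for k in range(n))

def compression_ratio(dims, ranks):
    """Dense kernel size divided by TR storage."""
    return prod(dims) / tr_num_params(dims, ranks)

# Hypothetical layer: 64 output channels, 64 input channels, 3x3 kernel.
# Hypothetical rank tuples that a TR solver might return for two circular
# shifts of the modes at the same accuracy (illustrative numbers only).
candidates = {
    0: ((64, 64, 3, 3), (8, 8, 8, 8)),
    1: ((64, 3, 3, 64), (8, 6, 4, 6)),
}

# Keep the circular shift whose TR representation needs the least storage.
best_shift = min(candidates, key=lambda k: tr_num_params(*candidates[k]))

for shift, (dims, ranks) in candidates.items():
    print(shift, tr_num_params(dims, ranks), round(compression_ratio(dims, ranks), 2))
print("selected shift:", best_shift)
```

With these made-up ranks the second shift stores fewer entries, which is the kind of comparison the permutation selection relies on. Real rank tuples would come from running a TR decomposition on each permuted kernel.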

Paper Structure (22 sections, 1 theorem, 17 equations, 12 figures, 6 tables, 1 algorithm)

Key Result

Theorem 1

Let $\mathcal{Y} = {\tt TR}(\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)})$ be the TR format of $\mathcal{Y} \in \mathbb R^{I_1 \times \ldots \times I_N}$, and let $\mathcal{Y}^{\tau_k} \in \mathbb R^{I_{k+1} \times \ldots \times I_N \times I_1 \times \ldots \times I_k}$ express th

Figures (12)

  • Figure 1: TR decompositions of all circular mode-permutations of the kernel weight tensor.
  • Figure 2: Visual representation of the proposed new layer constructed from the decomposed factors in the PyTorch library for circular shift $\tau_0$.
  • Figure 3: Ratio $\rho$ versus rank $R$ in the range $[1,30]$ for all types of convolutional layers in the ResNet-32 network.
  • Figure 4: Fine-tuning curves for ResNet-18 and ResNet-34 compressed with the RSDTR method on the ImageNet dataset.
  • Figure 5: Results of ResNet-20 compression on CIFAR-10.
  • ...and 7 more figures

Theorems & Definitions (8)

  • Definition 1: Tensor contraction
  • Definition 2: Multi-mode tensor contraction
  • Definition 3: Circular shift (mickelin2020algorithms)
  • Definition 4: TR model (zhao2016tensor)
  • Theorem 1: Circular dimensional permutation invariance (zhao2016tensor)
  • Definition 5: 2D Convolution
  • Remark 1
  • Remark 2
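
The circular dimensional permutation invariance named in Theorem 1 can be checked numerically. The NumPy sketch below builds random TR cores, contracts them into a full tensor, and verifies that circularly shifting the cores by $k$ positions yields the tensor with its modes circularly shifted by $k$. The helper name `tr_to_tensor`, the dimensions, and the ranks are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def tr_to_tensor(cores):
    """Contract TR cores of shape (R_n, I_n, R_{n+1}) into the full tensor."""
    out = cores[0]  # shape (R_1, I_1, R_2)
    for core in cores[1:]:
        # Contract the trailing rank index with the next core's leading rank index.
        out = np.tensordot(out, core, axes=([out.ndim - 1], [0]))
    # out now has shape (R_1, I_1, ..., I_N, R_1); the trace closes the ring.
    return np.trace(out, axis1=0, axis2=out.ndim - 1)

rng = np.random.default_rng(0)
dims = (4, 5, 3, 3)    # hypothetical mode sizes I_1..I_4
ranks = (2, 3, 2, 4)   # hypothetical TR ranks R_1..R_4
N = len(dims)
cores = [rng.standard_normal((ranks[n], dims[n], ranks[(n + 1) % N]))
         for n in range(N)]
Y = tr_to_tensor(cores)

for k in range(N):
    # Cores in the order G^(k+1), ..., G^(N), G^(1), ..., G^(k).
    shifted = cores[k:] + cores[:k]
    # Modes in the order I_{k+1}, ..., I_N, I_1, ..., I_k.
    perm = list(range(k, N)) + list(range(k))
    assert np.allclose(tr_to_tensor(shifted), np.transpose(Y, perm))
```

The check passes because each tensor entry is the trace of a product of core slices, and the trace is unchanged when that product is cyclically rotated.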