Ultimate tensorization: compressing convolutional and FC layers alike

Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov

TL;DR

This work extends tensor-train (TT) compression from fully-connected layers to convolutional layers by reshaping convolutional kernels into higher-order tensors and applying TT-decomposition. Combining TT-conv with the earlier TT-FC compression reduces network size by about 80× with a 1.1% accuracy drop on CIFAR-10, and outperforms a naive application of TT to the 4-dimensional convolutional kernel. The method supports training via standard backpropagation and is compatible with quantization, offering a pathway toward deploying compact CNNs on mobile devices. The results motivate further evaluation on larger datasets like ImageNet and larger-scale architectures.

Abstract

Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset.
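
To make the compression idea concrete, here is a minimal sketch of how the parameter count changes when a kernel is reshaped and stored in TT-format. It assumes one particular layout that is not spelled out in the abstract: an ℓ×ℓ×C×S kernel with C and S each factorized into d integers, one core for the spatial dimensions, and one core per (C_i, S_i) pair. The function names, the factorizations, and the ranks are all hypothetical and chosen for illustration; they are not figures from the paper.

```python
from math import prod


def dense_conv_params(l, C, S):
    """Number of weights in an ordinary l x l convolution with C inputs and S outputs."""
    return l * l * C * S


def tt_conv_params(l, c_factors, s_factors, ranks):
    """Weights in an assumed TT layout of the reshaped kernel.

    c_factors, s_factors: factorizations C = prod(c_factors), S = prod(s_factors).
    ranks: the d internal TT-ranks r_1..r_d; the boundary ranks are fixed to 1.
    """
    d = len(c_factors)
    if len(s_factors) != d or len(ranks) != d:
        raise ValueError("c_factors, s_factors and ranks must have equal length")
    r = [1] + list(ranks) + [1]          # r_0 = 1, r_1..r_d, r_{d+1} = 1
    total = r[0] * (l * l) * r[1]        # spatial core
    for i in range(d):                   # one core per (C_i, S_i) pair
        total += r[i + 1] * c_factors[i] * s_factors[i] * r[i + 2]
    return total


if __name__ == "__main__":
    l, c_f, s_f, rk = 3, [4, 4, 4], [4, 4, 4], [8, 8, 8]   # illustrative values only
    dense = dense_conv_params(l, prod(c_f), prod(s_f))
    tt = tt_conv_params(l, c_f, s_f, rk)
    print(dense, tt, dense / tt)
```

Under these assumptions the count grows with the sum of the core sizes rather than the product of all dimensions, which is where the layer-level compression comes from. The ranks control the trade-off between size and accuracy.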

Paper Structure

This paper contains 9 sections, 9 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Reducing convolution \ref{eq:conv} to a matrix-by-matrix multiplication.
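
The reduction named in the Figure 1 caption can be sketched in NumPy as follows. This is a simplified illustration, assuming stride 1, no padding, and a channels-last layout. The function name `conv2d_as_matmul` and the array layouts are hypothetical choices for this sketch, not the paper's notation.

```python
import numpy as np


def conv2d_as_matmul(X, K):
    """Compute Y[x, y, s] = sum_{i,j,c} K[i, j, c, s] * X[x + i, y + j, c]
    as a single matrix-by-matrix product.

    X: input of shape (W, H, C); K: kernel of shape (l, l, C, S).
    Assumes stride 1 and no padding.
    """
    W, H, C = X.shape
    l, l2, Ck, S = K.shape
    if l != l2 or Ck != C:
        raise ValueError("kernel must have shape (l, l, C, S) matching the input channels")
    Wo, Ho = W - l + 1, H - l + 1

    # Each row holds one flattened l x l x C input patch.
    patches = np.empty((Wo * Ho, l * l * C), dtype=np.result_type(X, K))
    for x in range(Wo):
        for y in range(Ho):
            patches[x * Ho + y] = X[x:x + l, y:y + l, :].reshape(-1)

    # Flatten the kernel the same way (row-major), so rows of K_mat line up
    # with the columns of the patch matrix.
    K_mat = K.reshape(l * l * C, S)
    return (patches @ K_mat).reshape(Wo, Ho, S)
```

Once the kernel appears as an (ℓ²C)×S matrix, a matrix factorization such as the TT-format can be applied to it. The sketch only shows the reshaping step, not the factorization.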