Ultimate tensorization: compressing convolutional and FC layers alike
Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov
TL;DR
This work extends tensor-train (TT) compression from fully-connected (FC) layers to convolutional layers by reshaping convolutional kernels into higher-order tensors and applying TT-decomposition. Combining the resulting TT-conv layers with prior TT-FC compression reduces network size by up to about 80× with a 1.1% accuracy drop on CIFAR-10, and outperforms the naive application of TT-decomposition to the 4-dimensional convolutional kernel. The method supports training via standard backpropagation and is compatible with quantization, offering a pathway toward deploying compact CNNs on mobile devices. The results motivate further evaluation on larger datasets such as ImageNet and on larger-scale architectures.
Abstract
Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional convolution kernel does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve an 80x network compression rate with a 1.1% accuracy drop on the CIFAR-10 dataset.
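
To make the "reshape, then factorize" idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It reshapes a 4-dimensional kernel into a higher-order tensor and compresses it with the standard TT-SVD procedure (sequential truncated SVDs). The kernel size (3 x 3 x 16 x 16), the mode grouping (9, 4, 4, 4, 4), the rank cap of 8, and the helper names `tt_svd` and `tt_to_full` are all illustrative assumptions. The paper's exact ordering of input and output channel factors may differ, and the paper trains TT-cores with backpropagation rather than factorizing a random kernel.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split `tensor` into TT-cores of shape (r_prev, n_k, r_next) using truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))  # truncate to at most max_rank
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        # Carry the remainder (singular values times right vectors) to the next mode.
        mat = (s[:new_rank, None] * vt[:new_rank]).reshape(new_rank * dims[k + 1], -1)
        rank = new_rank
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract TT-cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])


# Hypothetical 3x3 kernel with 16 input and 16 output channels.
rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3, 16, 16))

# Assumed grouping: merge the spatial modes (3*3 = 9) and split each
# channel dimension into small factors (16 = 4*4).
reshaped = kernel.reshape(9, 4, 4, 4, 4)

cores = tt_svd(reshaped, max_rank=8)
n_params = sum(core.size for core in cores)
approx = tt_to_full(cores).reshape(kernel.shape)
rel_err = np.linalg.norm(approx - kernel) / np.linalg.norm(kernel)

print("original parameters:", kernel.size)
print("TT parameters:", n_params)
print("relative reconstruction error:", rel_err)
```

A random kernel has no low-rank structure, so the reconstruction error printed here is large; the sketch only illustrates the parameter count and the mechanics of the decomposition. Raising `max_rank` until no truncation occurs recovers the kernel exactly.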
