Table of Contents
Fetching ...

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks

Yang Sui, Miao Yin, Yu Gong, Jinqi Xiao, Huy Phan, Bo Yuan

TL;DR

By identifying the proper low-rank format and performance-improving strategy, ELRT, an efficient low-rank training solution for high-accuracy, high-compactness, low-rank CNN models is proposed.

Abstract

Low-rank compression, a popular model compression technique that produces compact convolutional neural networks (CNNs) with low rankness, has been well-studied in the literature. On the other hand, low-rank training, as an alternative way to train low-rank CNNs from scratch, has been exploited little yet. Unlike low-rank compression, low-rank training does not need pre-trained full-rank models, and the entire training phase is always performed on the low-rank structure, bringing attractive benefits for practical applications. However, the existing low-rank training solutions still face several challenges, such as a considerable accuracy drop and/or still needing to update full-size models during the training. In this paper, we perform a systematic investigation on low-rank CNN training. By identifying the proper low-rank format and performance-improving strategy, we propose ELRT, an efficient low-rank training solution for high-accuracy, high-compactness, low-rank CNN models. Our extensive evaluation results for training various CNNs on different datasets demonstrate the effectiveness of ELRT.

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks

TL;DR

By identifying the proper low-rank format and performance-improving strategy, ELRT, an efficient low-rank training solution for high-accuracy, high-compactness, low-rank CNN models is proposed.

Abstract

Low-rank compression, a popular model compression technique that produces compact convolutional neural networks (CNNs) with low rankness, has been well-studied in the literature. On the other hand, low-rank training, as an alternative way to train low-rank CNNs from scratch, has been exploited little yet. Unlike low-rank compression, low-rank training does not need pre-trained full-rank models, and the entire training phase is always performed on the low-rank structure, bringing attractive benefits for practical applications. However, the existing low-rank training solutions still face several challenges, such as a considerable accuracy drop and/or still needing to update full-size models during the training. In this paper, we perform a systematic investigation on low-rank CNN training. By identifying the proper low-rank format and performance-improving strategy, we propose ELRT, an efficient low-rank training solution for high-accuracy, high-compactness, low-rank CNN models. Our extensive evaluation results for training various CNNs on different datasets demonstrate the effectiveness of ELRT.
Paper Structure (22 sections, 7 equations, 8 figures, 22 tables, 1 algorithm)

This paper contains 22 sections, 7 equations, 8 figures, 22 tables, 1 algorithm.

Figures (8)

  • Figure 1: Different paths towards producing low-rank CNN models.
  • Figure 2: A low-rank CONV layer can either exhibit low-matrix-rankness (Top) or low-tensor-rankness (Bottom).
  • Figure 3: Approximation error (Mean Square Error (MSE)) of low-matrix-rank and low-tensor-rank methods for approximating ResNet-20 layers. Notice that MSE measurement is our analysis and exploration to identify the suitable low-rank format. It is not actually executed during training.
  • Figure 4: Training loss (left) and test accuracy (right) for low-tensor-rank ResNet-20 on CIFAR-10 with/without SO regularization. The same ranks are used for different experiments. Ranks are selected to provide 2$\times$ FLOPs reduction.
  • Figure 5: The mechanism of different approaches to impose orthogonality on $\textbf{U}^{(2)}$. From top to bottom: (a) Soft Orthogonal Regularization, (b) Double Soft Orthogonal Regularization, (c) Spectral Restricted Isometry Property Regularization, (d) Mutual Coherence Regularization.
  • ...and 3 more figures