Flattened Convolutional Neural Networks for Feedforward Acceleration

Jonghoon Jin, Aysegul Dundar, Eugenio Culurciello

TL;DR

Problem: standard CNNs are parameter- and compute-heavy for real-time tasks. Method: train flattened CNNs by decomposing 3D filters into sequences of 1D convolutions across channels, vertical, and horizontal directions (LVH), using two cascaded LVH stages to preserve performance. Findings: achieves similar or better accuracy with 8–10x fewer parameters and about a 2x speedup in feedforward execution, without post-training tuning. Impact: provides a simple, scalable approach to accelerate CNN inference and reduce memory, with potential applicability to large-scale models.

Abstract

We present flattened convolutional neural networks that are designed for fast feedforward execution. The redundancy of the parameters, especially the weights of the convolutional filters in convolutional neural networks, has been extensively studied, and different heuristics have been proposed to construct a low-rank basis of the filters after training. In this work, we train flattened networks that consist of a consecutive sequence of one-dimensional filters across all directions in 3D space to obtain performance comparable to conventional convolutional networks. We tested the flattened model on different datasets and found that the flattened layer can effectively substitute for the 3D filters without loss of accuracy. The flattened convolution pipelines provide around a two-times speed-up during the feedforward pass compared to the baseline model due to the significant reduction in learnable parameters. Furthermore, the proposed method does not require manual tuning or post-processing once the model is trained.
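
The idea of replacing a 3D filter with a sequence of one-dimensional filters can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation. It assumes a rank-one 3D filter, uses cross-correlation in "valid" mode for brevity, and omits bias. All sizes and the helper names `conv3d_valid` and `flattened_valid` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

C, Y, X = 3, 5, 5   # channels, filter height, filter width (illustrative sizes)
H, W = 12, 12       # input height and width (illustrative sizes)

# Three 1D filters: lateral (across channels), vertical, horizontal.
l = rng.standard_normal(C)
v = rng.standard_normal(Y)
h = rng.standard_normal(X)

# Rank-one 3D filter built as the outer product of the three 1D filters.
w3d = np.einsum("c,y,x->cyx", l, v, h)

x = rng.standard_normal((C, H, W))


def conv3d_valid(x, w):
    """Direct 3D filtering (cross-correlation, 'valid' mode), one output plane."""
    _, fy, fx = w.shape
    out_h, out_w = x.shape[1] - fy + 1, x.shape[2] - fx + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[:, i:i + fy, j:j + fx] * w)
    return out


def flattened_valid(x, l, v, h):
    """Three consecutive 1D filterings: lateral, then vertical, then horizontal."""
    # Lateral: weighted sum over channels gives a single (H, W) plane.
    plane = np.tensordot(l, x, axes=(0, 0))
    # Vertical: 1D filtering along the row axis.
    fy = v.shape[0]
    rows = plane.shape[0] - fy + 1
    vert = sum(v[k] * plane[k:k + rows, :] for k in range(fy))
    # Horizontal: 1D filtering along the column axis.
    fx = h.shape[0]
    cols = vert.shape[1] - fx + 1
    return sum(h[k] * vert[:, k:k + cols] for k in range(fx))


direct = conv3d_valid(x, w3d)
flat = flattened_valid(x, l, v, h)
print(np.allclose(direct, flat))
```

The equivalence holds only because `w3d` is rank one by construction. A general trained 3D filter is not, which is why the flattened network is trained directly in its 1D form rather than obtained by factorizing a trained filter.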

Paper Structure

This paper contains 11 sections, 4 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: The concept of 3D filter separation under the rank-one assumption in the context of CNNs. Summing over all planes convolved with 2D filters produces a single output plane, which can be considered a 3D convolution. Three consecutive 1D filters are an equivalent representation of a 3D filter if its rank is one. $C$ is the number of planes, and its value is $3$ in the diagram. $Y$ and $X$ denote filter height and width, respectively. Bias is not considered here for simplicity.
  • Figure 2: A single-layer structure of flattened convolutional networks. A flattened layer includes $l$ sets of 1D-separated convolutions over the channel (lateral, $L$), vertical ($V$), and horizontal ($H$) directions. In this work, two stages of $LVH$ combinations ($l=2$) were chosen by cross-validation, and this configuration matched the accuracy measured for the baseline model. $V$ and $H$ convolutions are operated in full mode to preserve the same output dimension as the baseline model. Bias is added after each 1D convolution but is omitted from this illustration. No non-linear operator is applied within the flattened layer.
  • Figure 3: Convergence rate of the flattened and baseline models in both training and testing. The structures of the baseline and flattened models are specified in Section \ref{sec:train-baseline} and Table \ref{table:model-spec-cifar10}, respectively, and the CIFAR-10 dataset is used in this experiment. The solid line denotes the mean, and the shading around the line indicates the standard deviation of the curve. The variation in training for the flattened model is too small to appear in the illustration.
  • Figure 4: Visualization of the first-layer filters trained on CIFAR-10. Filters are reconstructed from the cross-product of the 1D convolution filters and contain clear and diverse features. Filters are sorted by variance in descending order, and bias is excluded during reconstruction.
  • Figure 5: Filter dimensions of $X=Y=5$ and $C = 128$ are used to observe reduction efficiency.
  • ...and 2 more figures
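
As a rough illustration of why the rank-one separation in Figure 1 reduces the parameter count, the sketch below compares the weights of one dense 3D filter with those of its three 1D factors, using the filter dimensions quoted in the Figure 5 caption ($X=Y=5$, $C=128$). This is back-of-the-envelope arithmetic under stated assumptions, not a reproduction of the figure. It counts a single rank-one filter, ignores bias, and ignores the second $LVH$ stage, so the ratio is not the network-level reduction reported in the paper. The function names are hypothetical.

```python
def dense_filter_params(C, Y, X):
    """Weights in one full 3D filter of size C x Y x X (bias ignored)."""
    return C * Y * X


def rank_one_filter_params(C, Y, X):
    """Weights in three 1D filters (lateral, vertical, horizontal), bias ignored."""
    return C + Y + X


# Filter dimensions taken from the Figure 5 caption.
C, Y, X = 128, 5, 5

dense = dense_filter_params(C, Y, X)      # 128 * 5 * 5 = 3200
flat = rank_one_filter_params(C, Y, X)    # 128 + 5 + 5 = 138
ratio = dense / flat

print(f"dense: {dense}, flattened: {flat}, ratio: {ratio:.1f}x")
```

The per-filter ratio grows with $C$, $Y$, and $X$, since the dense count is a product while the separated count is a sum.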