Flattened Convolutional Neural Networks for Feedforward Acceleration
Jonghoon Jin, Aysegul Dundar, Eugenio Culurciello
TL;DR
Problem: standard CNNs are parameter- and compute-heavy for real-time tasks. Method: train flattened CNNs in which each 3D filter is replaced by a sequence of 1D convolutions along the lateral (across channels), vertical, and horizontal directions (LVH), using two cascaded LVH stages to preserve performance. Findings: the flattened networks reach similar or better accuracy with 8–10x fewer parameters and about a 2x speedup in feedforward execution, without post-training tuning. Impact: a simple, scalable way to accelerate CNN inference and reduce memory use, with potential applicability to large-scale models.
Abstract
We present flattened convolutional neural networks that are designed for fast feedforward execution. The redundancy of parameters in convolutional neural networks, especially the weights of the convolutional filters, has been extensively studied, and various heuristics have been proposed to construct a low-rank basis of the filters after training. In this work, we instead train flattened networks that consist of a sequence of consecutive one-dimensional filters across all directions in 3D space, and obtain performance comparable to that of conventional convolutional networks. We tested the flattened model on different datasets and found that the flattened layer can effectively substitute for the 3D filters without loss of accuracy. The flattened convolution pipelines provide around a two times speed-up during the feedforward pass compared to the baseline model, due to the significant reduction in learnable parameters. Furthermore, the proposed method requires no manual tuning or post-processing once the model is trained.
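To make the idea of a flattened filter concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single filter, "valid" boundaries, cross-correlation rather than flipped convolution, and no bias. The names `conv3d_valid`, `flattened_conv`, `param_counts`, and the vectors `alpha`, `beta`, `gamma` are hypothetical and chosen only for illustration. It checks that when a 3D filter is exactly the outer product of three 1D vectors (a rank-1 filter), applying the 1D filters in sequence (channels, then vertical, then horizontal) gives the same output as the full 3D filter. A general trained 3D filter is not rank-1, which is why the abstract describes training the 1D filters directly instead of factorizing after training.

```python
import numpy as np


def conv3d_valid(x, w):
    """Direct 'valid' cross-correlation of input x (C, H, W) with one 3D filter w (C, Y, X)."""
    C, H, W = x.shape
    _, Y, X = w.shape
    out = np.empty((H - Y + 1, W - X + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[:, i:i + Y, j:j + X] * w)
    return out


def flattened_conv(x, alpha, beta, gamma):
    """Apply three 1D filters in sequence: across channels, vertical, horizontal."""
    C, H, W = x.shape
    Y, X = beta.size, gamma.size
    # Lateral step: weighted sum over the channel axis -> (H, W)
    lat = np.tensordot(alpha, x, axes=(0, 0))
    # Vertical step: 1D cross-correlation along rows -> (H - Y + 1, W)
    ver = sum(beta[y] * lat[y:H - Y + 1 + y, :] for y in range(Y))
    # Horizontal step: 1D cross-correlation along columns -> (H - Y + 1, W - X + 1)
    return sum(gamma[k] * ver[:, k:W - X + 1 + k] for k in range(X))


def param_counts(C, F, Y, X):
    """Weights in one layer with F filters, ignoring biases (illustrative assumption).

    Returns (standard 3D filters, one flattened L-V-H stage).
    """
    return F * C * Y * X, F * (C + Y + X)


# Illustrative sizes, not taken from the paper.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))   # C=4 channels, 8x8 spatial
alpha = rng.standard_normal(4)       # 1D filter across channels
beta = rng.standard_normal(3)        # 1D vertical filter
gamma = rng.standard_normal(3)       # 1D horizontal filter

# Rank-1 3D filter built as the outer product of the three 1D filters.
w = np.einsum("c,y,x->cyx", alpha, beta, gamma)

direct = conv3d_valid(x, w)
flat = flattened_conv(x, alpha, beta, gamma)
print(np.allclose(direct, flat))
print(param_counts(C=96, F=256, Y=3, X=3))
```

The per-layer counts from `param_counts` cover a single flattened stage only. The parameter reduction and speed-up quoted in the abstract are network-level results and also depend on the full architecture, so this function should be read as a rough guide to where the savings come from, not as a reproduction of those figures.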
