Algebraic Representations for Faster Predictions in Convolutional Neural Networks
Johnny Joyce, Jan Verschelde
TL;DR
This work develops algebraic representations for CNNs with skip connections to speed up prediction-time inference. For linear CNNs, the authors prove that an arbitrarily deep, skip-connected network can be pre-computed into a single affine map $f(X)=W X + B$, so that prediction costs about as much as evaluating one layer. For nonlinear networks, they introduce a homotopy approach that gradually removes skip connections during training, achieving prediction-time speedups of up to 22% to 46% in the reported experiments while preserving accuracy. The results are demonstrated with ResNet34 on MNIST and show a practical way to combine the expressiveness of deep models with the computational efficiency of shallow predictors. Future work aims to unify the linear and nonlinear results via algebraic-geometric tools and to optimize the schedule for skip-connection strength.
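To make the collapse to $f(X)=W X + B$ concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each layer is a dense linear map with an identity skip connection, $x \mapsto A x + b + x$, acting on a flattened input vector (a convolution is also a linear map, so it can be written this way). The names `forward_layerwise`, `collapse`, and `predict_collapsed` are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import random

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(P, Q):
    """Matrix-matrix product P Q for list-of-rows matrices."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def forward_layerwise(x, layers):
    """Apply each residual linear layer in turn: x <- A x + b + x."""
    for A, b in layers:
        Ax = matvec(A, x)
        x = [ax + bi + xi for ax, bi, xi in zip(Ax, b, x)]
    return x

def collapse(layers, dim):
    """Pre-compute (W, B) so that W x + B equals the whole residual stack."""
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    B = [0.0] * dim
    for A, b in layers:
        # The skip connection turns the layer matrix A into M = A + I.
        M = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(dim)]
             for i in range(dim)]
        # Composing with the map so far: M (W x + B) + b = (M W) x + (M B + b).
        W = matmul(M, W)
        B = [mb + bi for mb, bi in zip(matvec(M, B), b)]
    return W, B

def predict_collapsed(x, W, B):
    """Single affine evaluation replacing the full layer-by-layer pass."""
    return [wx + bi for wx, bi in zip(matvec(W, x), B)]

# Illustrative random "trained" layers (assumed sizes, not from the paper).
random.seed(0)
dim, depth = 6, 12
layers = [
    ([[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)],
     [random.gauss(0, 1) for _ in range(dim)])
    for _ in range(depth)
]

W, B = collapse(layers, dim)
x = [random.gauss(0, 1) for _ in range(dim)]
deep = forward_layerwise(x, layers)
shallow = predict_collapsed(x, W, B)
max_gap = max(abs(d - s) for d, s in zip(deep, shallow))
```

In this sketch, `collapse` runs once after training. Every later prediction then needs one matrix-vector product instead of `depth` of them, and `max_gap` measures how closely the two paths agree up to floating-point rounding.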
Abstract
Convolutional neural networks (CNNs) are a popular choice of model for tasks in computer vision. When CNNs are made with many layers, resulting in a deep neural network, skip connections may be added to create an easier gradient optimization problem while retaining model expressiveness. In this paper, we show that arbitrarily complex, trained, linear CNNs with skip connections can be simplified into a single-layer model, resulting in greatly reduced computational requirements at prediction time. We also present a method for training nonlinear models with skip connections that are gradually removed throughout training, giving the benefits of skip connections without requiring computational overhead at prediction time. These results are demonstrated with practical examples on the Residual Network (ResNet) architecture.
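The idea of gradually removing skip connections can be illustrated with a small sketch. This is an assumption-laden illustration, not the method as specified in the paper. It assumes a residual block of the form $F(x) + \alpha x$ with a scalar skip strength $\alpha$, and a linear schedule that moves $\alpha$ from 1 to 0 over training. The names `skip_strength` and `residual_block` and the example weights are hypothetical, and the training step itself is omitted.

```python
def skip_strength(epoch, total_epochs):
    """Linear schedule (an assumption): 1.0 at epoch 0, 0.0 at the final epoch."""
    if total_epochs <= 1:
        return 0.0
    return 1.0 - epoch / (total_epochs - 1)

def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, t) for t in v]

def residual_block(x, weights, bias, alpha):
    """Nonlinear block F(x) = relu(W x + b) plus a skip connection scaled by alpha."""
    Fx = relu([sum(w * xi for w, xi in zip(row, x)) + b
               for row, b in zip(weights, bias)])
    return [f + alpha * xi for f, xi in zip(Fx, x)]

# Illustrative fixed parameters; in practice these would be learned.
weights = [[0.5, -0.2], [0.1, 0.3]]
bias = [0.0, 0.1]
x = [1.0, 2.0]

total_epochs = 5
outputs = []
for epoch in range(total_epochs):
    alpha = skip_strength(epoch, total_epochs)
    # A real training step (forward pass, loss, backward pass, update)
    # would go here, using the current alpha in every residual block.
    outputs.append((alpha, residual_block(x, weights, bias, alpha)))
```

Once $\alpha$ reaches 0, the block computes only $F(x)$, so the prediction-time network needs no skip-connection additions. The linear schedule here is one simple choice among many.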
