A Simple Sparse Matrix Vector Multiplication Approach to Padded Convolution
Zan Chaudhry
TL;DR
The paper addresses the inefficiency of padding-aware convolution by formulating padding and convolution as sparse transformations, using matrices P and C, so that convolution can be computed via sparse matrix-vector multiplication (SpMV). A key theoretical contribution is Theorem 2.1, which provides an explicit expression for the number of non-zero multiplications, highlighting where sparsity reduces work. The paper presents proof-of-concept CPU and GPU implementations and compares them to Conv2D on DenseNet121, showing that the CPU variants achieve speedups over Conv2D-C and that GPU performance is competitive, particularly in fixed-kernel regimes. The work demonstrates sparsity-aware convolution as a promising direction for accelerating inference and motivates further development of sparse representations and multi-channel/batched extensions for real-time CNNs.
Abstract
We introduce an algorithm for efficiently representing convolution with zero-padding and stride as a sparse transformation matrix, applied to a vectorized input through sparse matrix-vector multiplication (SpMV). We provide a theoretical contribution with an explicit expression for the number of non-zero multiplications in convolutions with stride and padding, offering insight into the potential for leveraging sparsity in convolution operations. A proof-of-concept implementation is presented in Python, demonstrating the performance of our method on both CPU and GPU architectures. This work contributes to the broader exploration of sparse matrix techniques in convolutional algorithms, with a particular focus on leveraging matrix multiplications for parallelization. Our findings lay the groundwork for future advancements in exploiting sparsity to improve the efficiency of convolution operations in fields such as machine learning and signal processing.
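To make the idea concrete, the following is a minimal sketch of padded, strided 2D convolution expressed as a single SpMV. It is not the paper's implementation: it builds one combined sparse matrix directly rather than separate P and C matrices, it assumes a single channel and the cross-correlation convention common in CNN libraries, and the function names `build_conv_matrix` and `direct_conv2d` are hypothetical. Kernel taps that would land on zero-padding are never stored, so the stored-entry count of the matrix equals the number of multiplications actually performed (provided the kernel itself has no zero entries).

```python
import numpy as np
from scipy.sparse import csr_matrix


def build_conv_matrix(kernel, in_shape, stride=1, pad=0):
    """Build a sparse matrix A so that A @ x.ravel() is the padded, strided
    cross-correlation of x with kernel (illustrative, single channel)."""
    kh, kw = kernel.shape
    h, w = in_shape
    out_h = (h + 2 * pad - kh) // stride + 1
    out_w = (w + 2 * pad - kw) // stride + 1
    rows, cols, vals = [], [], []
    for oy in range(out_h):
        for ox in range(out_w):
            row = oy * out_w + ox  # index of this output pixel
            for ky in range(kh):
                for kx in range(kw):
                    # Input coordinates in the unpadded image.
                    iy = oy * stride + ky - pad
                    ix = ox * stride + kx - pad
                    # Taps that fall in the padding multiply by zero: skip them.
                    if 0 <= iy < h and 0 <= ix < w:
                        rows.append(row)
                        cols.append(iy * w + ix)
                        vals.append(kernel[ky, kx])
    shape = (out_h * out_w, h * w)
    return csr_matrix((vals, (rows, cols)), shape=shape), (out_h, out_w)


def direct_conv2d(x, kernel, stride=1, pad=0):
    """Dense reference: explicit zero-padding followed by a sliding window."""
    kh, kw = kernel.shape
    xp = np.pad(x, pad)
    out_h = (xp.shape[0] - kh) // stride + 1
    out_w = (xp.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for oy in range(out_h):
        for ox in range(out_w):
            window = xp[oy * stride:oy * stride + kh, ox * stride:ox * stride + kw]
            out[oy, ox] = np.sum(window * kernel)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((6, 6))
    k = rng.standard_normal((3, 3))
    A, out_shape = build_conv_matrix(k, x.shape, stride=2, pad=1)
    y = (A @ x.ravel()).reshape(out_shape)  # the convolution as one SpMV
    print("matches dense reference:", np.allclose(y, direct_conv2d(x, k, 2, 1)))
    print("stored multiplications:", A.nnz, "of", out_shape[0] * out_shape[1] * k.size)
```

Because the matrix depends only on the kernel, input shape, stride, and padding, it can be built once and reused across inputs, which is the setting in which the construction cost is amortized.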
