Inducing Semi-Structured Sparsity by Masking for Efficient Model Inference in Convolutional Networks

David A. Danhofer

TL;DR

This paper proposes a novel method for learning semi-structured sparsity patterns for convolution kernels in the form of maskings. The learned maskings make readily available hardware acceleration usable and speed up convolutional models more than two-fold during inference without decreasing model performance.

Abstract

The crucial role of convolutional models, both as standalone vision models and as backbones in foundation models, necessitates effective acceleration techniques. This paper proposes a novel method for learning semi-structured sparsity patterns for convolution kernels in the form of maskings, which enables the use of readily available hardware acceleration. The approach accelerates convolutional models more than two-fold during inference without decreasing model performance. At the same time, the original model weights and structure remain unchanged, so the model stays easy to update. Beyond the immediate practical use, the effect of maskings on prediction is easily quantifiable. Guarantees on model predictions under maskings are therefore derived, showing stability bounds for learned maskings even after the original underlying model has been updated.
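To make the idea of a masking concrete, the following is a minimal sketch of a 2:4 mask applied to a row of weights. It assumes a simple magnitude rule (keep the two largest-magnitude entries in every group of four) purely as a stand-in; the paper learns its maskings, and that learning procedure is not reproduced here. The function names `two_four_mask`, `apply_mask`, and `is_two_four` are hypothetical.

```python
def two_four_mask(row):
    """Return a 0/1 mask keeping 2 entries in every group of 4.

    Assumption: entries are chosen by magnitude. This is only a stand-in
    for the learned maskings described in the abstract.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    mask = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Positions of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        mask.extend(1 if i in keep else 0 for i in range(4))
    return mask


def apply_mask(row, mask):
    """Element-wise product; returns a new list and leaves `row` unchanged."""
    return [w * m for w, m in zip(row, mask)]


def is_two_four(mask):
    """Check that every group of 4 mask entries contains exactly 2 ones."""
    return all(sum(mask[s:s + 4]) == 2 for s in range(0, len(mask), 4))


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
mask = two_four_mask(weights)
sparse = apply_mask(weights, mask)
```

Because the mask is stored separately and applied by multiplication, the dense `weights` list is never modified, which mirrors the abstract's point that the original weights remain unchanged.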

Paper Structure

This paper contains 21 sections, 6 theorems, 21 equations, 3 figures, and 4 tables.

Key Result

Lemma 3.1

Let $f(x) = (\text{softmax} \circ f_d \circ ... \circ f_1)(x)$ be a compositional classifier of depth $d$ with $f_i(x) = \sigma(W_ix + b_i)$ predicting the class probabilities of an input sample $x \in X$ across $c$ classes. Let $\sigma$ be a non-linear element-wise activation function and $L$-Lipsc

Figures (3)

  • Figure 1: A 2:4 sparse matrix of floating point values -- obtained from a dense matrix and a sparse bit mask -- and its equivalent structured representation containing only the non-zero entries and a 2-bit index preserving the structure taking up roughly only half the space.
  • Figure 2: Different levels of granularity in a 4D-tensor as used in 2D-convolutions of multi-channel inputs with several filters; although the same number of weights is retained and pruned at every level of structure, more structure makes processing easier while limiting the number of possible patterns.
  • Figure 3: Simplified visualizations of convolutions on single channel input $X$ of unspecified width and height and a single filter $H$ as (a) "standard" convolution with a moving filter and (b) as a matrix product between an unfolded input $\tilde{X}$ and a weight matrix $W$ derived from the filter
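The ideas behind Figures 1 and 3 can be sketched together in a few lines. The example below is illustrative only: it stores a 2:4 sparse weight row as its kept values plus a position (0 to 3) per value, unfolds a single-channel input into patches, and computes the convolution as a product that touches only the stored non-zeros. All function names are hypothetical, the layout is a simplification rather than the packed format used by actual hardware, and, as is common in deep learning, the "convolution" is computed without flipping the filter (stride 1, no padding).

```python
def compress_2to4(row):
    """Store each group of 4 as 2 values plus their positions within the group."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    values, indices = [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        kept = [i for i, v in enumerate(group) if v != 0]
        if len(kept) > 2:
            raise ValueError("group has more than 2 non-zeros; not 2:4 sparse")
        # Pad with zero-valued positions so every group stores exactly 2 entries.
        for i in range(4):
            if len(kept) == 2:
                break
            if i not in kept:
                kept.append(i)
        kept.sort()
        values.extend(group[i] for i in kept)
        indices.extend(kept)  # each position fits in 2 bits
    return values, indices


def decompress_2to4(values, indices):
    """Rebuild the dense row from the compressed representation."""
    row = []
    for k in range(0, len(values), 2):
        group = [0.0] * 4
        group[indices[k]] = values[k]
        group[indices[k + 1]] = values[k + 1]
        row.extend(group)
    return row


def unfold(x, kh, kw):
    """Each output row is one flattened kh-by-kw patch of the 2D input x."""
    h, w = len(x), len(x[0])
    return [
        [x[i + di][j + dj] for di in range(kh) for dj in range(kw)]
        for i in range(h - kh + 1)
        for j in range(w - kw + 1)
    ]


def sparse_dot(patch, values, indices):
    """Dot product that reads only the stored entries of a compressed weight row."""
    total = 0.0
    for k, (v, idx) in enumerate(zip(values, indices)):
        group_start = (k // 2) * 4  # two stored values per group of 4
        total += v * patch[group_start + idx]
    return total


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
filt = [[1, 0],
        [0, -1]]                            # flattened: [1, 0, 0, -1], which is 2:4 sparse
w_row = [v for r in filt for v in r]

patches = unfold(x, 2, 2)                   # unfolded input, one patch per row
dense_out = [sum(p * w for p, w in zip(patch, w_row)) for patch in patches]

values, indices = compress_2to4(w_row)
sparse_out = [sparse_dot(patch, values, indices) for patch in patches]
```

Here `sparse_out` matches `dense_out` while performing half as many multiplications per patch, which is the kind of saving that hardware support for 2:4 patterns exploits.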

Theorems & Definitions (11)

  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Lemma 3.4
  • Lemma 3.5
  • Lemma 3.6
  • proof
  • proof
  • proof
  • proof
  • ...and 1 more