Convolutions Predictable Offloading to an Accelerator: Formalization and Optimization

Benjamin Husson; Mohammed Belcaïd; Thomas Carle; Claire Pagetti

Abstract

Convolutional neural networks (CNNs) require a large number of multiply-accumulate (MAC) operations. To meet real-time constraints, they often need to be executed on specialized accelerators composed of an on-chip memory and a processing unit. However, the on-chip memory is often insufficient to store all the data required to compute a CNN layer. Thus, the computation must be performed in several offloading steps. We formalize such sequences of steps and apply our formalism to a state-of-the-art decomposition of convolutions. To find strategies that are optimal in terms of duration, we encode the problem as a set of constraints. A Python-based simulator allows in-depth analysis of the computed strategies.
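To make the notion of offloading steps concrete, here is a minimal sketch. It is not the paper's formalism or its simulator. It assumes a stride-1, unpadded convolution sliced by output rows, kernels that stay resident on chip after the first step, strictly sequential steps, and a linear cost model. The names `Step`, `row_slices` and `duration`, as well as the unit costs, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    loaded: int  # elements copied into on-chip memory during this step
    macs: int    # multiply-accumulate operations performed
    stored: int  # output elements written back to main memory


def row_slices(h_in, w_in, c_in, k, c_out, capacity):
    """Split a stride-1, unpadded convolution into steps of output rows.

    Assumption: each step holds its input rows, all kernels and its output
    rows in on-chip memory at the same time (sizes counted in elements).
    """
    h_out, w_out = h_in - k + 1, w_in - k + 1
    kernel_elems = k * k * c_in * c_out

    def footprint(rows):
        # rows output rows need rows + k - 1 input rows
        return (rows + k - 1) * w_in * c_in + kernel_elems + rows * w_out * c_out

    # Largest number of output rows per step that fits on chip.
    rows = h_out
    while rows > 0 and footprint(rows) > capacity:
        rows -= 1
    if rows == 0:
        raise ValueError("a single output row does not fit in on-chip memory")

    steps, done = [], 0
    while done < h_out:
        r = min(rows, h_out - done)
        loaded = (r + k - 1) * w_in * c_in
        if done == 0:
            loaded += kernel_elems  # kernels are loaded once, in the first step
        steps.append(Step(loaded, r * w_out * c_out * k * k * c_in, r * w_out * c_out))
        done += r
    return steps


def duration(steps, t_load=1, t_mac=1, t_store=1):
    """Total duration under a hypothetical linear, fully sequential cost model."""
    return sum(s.loaded * t_load + s.macs * t_mac + s.stored * t_store for s in steps)


if __name__ == "__main__":
    for capacity in (1000, 40):
        steps = row_slices(6, 6, 1, 3, 1, capacity)
        print(capacity, len(steps), duration(steps))
```

In this toy model the number of MAC operations does not depend on the slicing, but a smaller on-chip memory forces more steps and reloads the input rows shared by neighbouring slices, so the total duration grows. Trade-offs of this kind are what an optimization over strategies has to weigh.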

Paper Structure (34 sections, 22 equations, 13 figures, 1 table)

Introduction
Problem statement
Contributions
Applicability of the proposed approach
Outline
System model -- proposed formalism
Platform model
Application model: Steps and actions
Assumptions
Representing the slices for convolutions
Reminder on convolutions
Slicing of convolution
Strategy formalization
S1-baseline formalization
S1 formalization
...and 19 more sections

Figures (13)

Figure 1: Generic accelerator architecture
Figure 2: A step = sequence of execution
Figure 3: Multi-core with local SPM (e.g. AURIX)
Figure 4: Eyeriss architecture
Figure 5: TMMA architecture
...and 8 more figures

Theorems & Definitions (25)

Definition 1: n-step computation
Definition 2: Semantics of an n-step computation
Definition 3: Duration of an n-step strategy
Definition 4: Tensor
Definition 5: 2D convolution operation
Remark 1
Definition 6: 3D-Input tensor
Definition 7: Kernels
Definition 8: 3D-Output tensor
Remark 2
...and 15 more
