SOI: Scaling Down Computational Complexity by Estimating Partial States of the Model

Grzegorz Stefański, Paweł Daniluk, Artur Szumaczuk, Jakub Tkaczuk

TL;DR

This work presents Scattered Online Inference (SOI), a novel method that aims to reduce the computational complexity of ANNs. By applying compression, SOI generates more general inner partial states of the ANN, so the full model does not have to be recalculated at each inference.

Abstract

Consumer electronics used to follow the miniaturization trend described by Moore's Law. Despite the increased processing power of Microcontroller Units (MCUs), those used in the smallest appliances still cannot run even moderately large, state-of-the-art artificial neural networks (ANNs), especially in time-sensitive scenarios. In this work, we present a novel method called Scattered Online Inference (SOI) that aims to reduce the computational complexity of ANNs. SOI leverages the continuity and seasonality of time-series data and model predictions, enabling extrapolation that improves processing speed, particularly in deeper layers. By applying compression, SOI generates more general inner partial states of the ANN, which makes it possible to skip full model recalculation at each inference.

Paper Structure (33 sections, 7 equations, 11 figures, 14 tables)

Figures (11)

  • Figure 1: SOI for convolutional operations. For visualization purposes, we show data as frames in the time domain. A) Standard convolution. B) Strided convolution. C) Strided-cloned convolution. D) Shifted convolution. E) Shifted strided-cloned convolution.
  • Figure 2: Inference patterns of each type of SOI, based on the U-Net architecture. A) Unmodified causal U-Net. B) Partially predictive (PP) SOI. C) Even inference of PP. D) Odd inference of PP. E) Fully predictive (FP) SOI. F) Even inference of FP. G) Odd inference of FP.
  • Figure 3: SOI PP inference pattern.
  • Figure 4: Results of speech separation experiment with PP SOI.
  • Figure 5: Results of speech separation experiment with FP SOI.
  • ...and 6 more figures
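
The abstract's idea of skipping recalculation, together with the "strided-cloned" convolution named in the Figure 1 caption, can be illustrated with a rough sketch. This is not the authors' implementation: the function name `strided_cloned_conv1d`, the stride of 2, and the choice to simply hold the last computed value for skipped frames are all assumptions made for illustration. It shows a causal 1D convolution that is evaluated only on every second frame, with the most recent result reused for the frame in between.

```python
import numpy as np


def strided_cloned_conv1d(x, kernel, stride=2):
    """Hypothetical sketch: causal 1D convolution computed only every
    `stride`-th frame; skipped frames reuse (clone) the last result.

    Returns the output sequence and the number of frames actually computed.
    """
    k = len(kernel)
    # Left-pad with zeros so the output at frame t depends only on x[:t + 1].
    padded = np.concatenate([np.zeros(k - 1), x])
    out = np.empty(len(x))
    last = 0.0
    computed = 0
    for t in range(len(x)):
        if t % stride == 0:
            # Full computation on this frame.
            last = float(np.dot(padded[t:t + k], kernel))
            computed += 1
        # On skipped frames the cached value is cloned instead.
        out[t] = last
    return out, computed


if __name__ == "__main__":
    signal = np.arange(8.0)
    out, computed = strided_cloned_conv1d(signal, np.array([0.5, 0.5]))
    print(out)
    print(f"computed {computed} of {len(signal)} frames")
```

With a stride of 2, only half of the frames trigger a dot product, which is the kind of saving the abstract describes. How SOI actually estimates the skipped states, and how the shifted variants in Figure 1 differ, is not specified on this page and is not modeled here.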