Factorizers for Distributed Sparse Block Codes

Michael Hersche; Aleksandar Terzic; Geethan Karunaratne; Jovin Langenegger; Angéline Pouget; Giovanni Cherubini; Luca Benini; Abu Sebastian; Abbas Rahimi

TL;DR

This work proposes a fast and highly accurate method for factorizing a more flexible and hence generalized form of SBCs, dubbed GSBCs, and provides a methodology to flexibly integrate the factorizer in the classification layer of CNNs with a novel loss function.

Abstract

Distributed sparse block codes (SBCs) exhibit compact representations for encoding and manipulating symbolic data structures using fixed-width vectors. One major challenge, however, is to disentangle, or factorize, the distributed representation of data structures into their constituent elements without having to search through all possible combinations. This factorization becomes more challenging when SBC vectors are noisy due to perceptual uncertainty and approximations made by the modern neural networks that generate the query SBC vectors. To address these challenges, we first propose a fast and highly accurate method for factorizing a more flexible and hence generalized form of SBCs, dubbed GSBCs. Our iterative factorizer introduces a threshold-based nonlinear activation, conditional random sampling, and an $\ell_\infty$-based similarity metric. Second, the proposed factorizer maintains a high accuracy when queried by noisy product vectors generated using deep convolutional neural networks (CNNs). This facilitates its application in replacing the large fully connected layer (FCL) in CNNs, whereby $C$ trainable class vectors, or attribute combinations, can be implicitly represented by our factorizer having $F$-factor codebooks, each with $\sqrt[F]{C}$ fixed codevectors. We provide a methodology to flexibly integrate our factorizer in the classification layer of CNNs with a novel loss function. With this integration, the convolutional layers can generate a noisy product vector that our factorizer can still decode, whereby the decoded factors can have different interpretations based on downstream tasks. We demonstrate the feasibility of our method on four deep CNN architectures over CIFAR-100, ImageNet-1K, and RAVEN datasets. In all use cases, the numbers of parameters and operations are notably reduced compared to the FCL.
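The codebook-size arithmetic in the abstract can be made concrete with a small sketch. It assumes that $C$ is a perfect $F$-th power and that classes map to factor indices by a simple mixed-radix decomposition; the function names `codebook_size` and `class_to_factors` and that mapping are hypothetical illustrations, not the authors' implementation.

```python
def codebook_size(num_classes: int, num_factors: int) -> int:
    """Codevectors per codebook, i.e. the F-th root of C.

    Assumption: C is a perfect F-th power; otherwise we raise an error.
    """
    m = round(num_classes ** (1.0 / num_factors))
    if m ** num_factors != num_classes:
        raise ValueError("num_classes is not a perfect F-th power")
    return m


def class_to_factors(class_index: int, size: int, num_factors: int) -> tuple:
    """Hypothetical mapping from a class index to one index per codebook
    (mixed-radix digits, least significant factor first)."""
    indices = []
    for _ in range(num_factors):
        indices.append(class_index % size)
        class_index //= size
    return tuple(indices)


# Example: C = 100 classes with F = 2 factors.
m = codebook_size(100, 2)           # 10 codevectors per codebook
stored = 2 * m                      # 20 fixed codevectors in total
factors = class_to_factors(37, m, 2)
print(m, stored, factors)
```

With $C=100$ and $F=2$, the two codebooks hold 20 fixed codevectors in total, whereas an explicit representation would need 100 class vectors. The savings grow with $C$, since the stored count scales as $F\sqrt[F]{C}$ rather than $C$.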

Paper Structure (42 sections, 10 equations, 9 figures, 12 tables)

Introduction
VSA Preliminary
Related Work
Factorizing distributed representations
Fixing the final FCL in CNNs
Part I: Factorization of Generalized Sparse Block Codes
Generalized sparse block codes (GSBCs)
Binding/Unbinding
Bundling
$\ell_\infty$-based Similarity
Factorization problem
Block code factorizer (BCF)
Hyperparameter optimization
Experimental setup
Comparative results
...and 27 more sections

Figures (9)

Figure 1: Block code factorizer (BCF) for $F=2$ factors. It can factorize both synthetic binary SBC product vectors and product vectors ($\mathbf{p}$) which might result from a neural network mapping.
Figure 2: Threshold and sampling width of Bayesian optimization for $D_p=512$, $B=4$, and $F=2$.
Figure 3: Factorization accuracy (left) and number of iterations (right) of various BCF configurations on synthetic (i.e., exact) product vectors for different problem sizes ($\prod_{f=1}^{F} M_f$). We set $D_p=512$, $F=2$, and $B=4$. The maximum operational capacity is marked with a cross. Problem sizes exceeding the operational capacity, where the accuracy falls below 99%, are marked with dashed lines. BCF configured with binary SBC operations (in blue) cannot solve any of the displayed problem sizes at the required accuracy.
Figure 4: Effect of the dimension $D_p$ on the number of iterations for BCF with $B=4$, $F=2$ (left) and $F=3$ (right).
Figure 5: Effect of the number of blocks $B$ on the number of iterations for BCF with $D=512$, $F=2$ (left) and $F=3$ (right).
...and 4 more figures
