
Using predefined vector systems to speed up neural network multimillion class classification

Nikita Gabdullin, Ilya Androsov

Abstract

Label prediction in neural networks (NNs) has O(n) complexity, proportional to the number of classes. This holds true both for classification using fully connected layers and for cosine similarity against a set of class prototypes. In this paper we show that if the NN latent space (LS) geometry is known and possesses specific properties, label prediction complexity can be significantly reduced. This is achieved by associating label prediction with an O(1) closest-cluster-center search in a vector system used as the target for latent space configuration (LSC). The proposed method only requires finding the indexes of the several largest and smallest values in the embedding vector, making it extremely computationally efficient. We show that the proposed method does not change the results of NN training or accuracy computations. We also measure the time required by the different computational stages of NN inference and label prediction on multiple datasets. The experiments show that the proposed method achieves up to an 11.6-fold overall acceleration over conventional methods. Furthermore, the proposed method has unique properties that allow it to predict the existence of new classes.
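The index-based lookup the abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' code: the function name `predict_center_indices` and the toy embedding are assumptions. The idea is that the positions of the m largest and k smallest entries of an embedding identify the closest predefined center directly, so the cost does not grow with the number of classes the way a cosine-similarity scan over all prototypes does.

```python
import numpy as np

def predict_center_indices(embedding, m, k):
    """Return the index sets of the m largest and k smallest entries.

    Under the paper's latent space configuration, these index sets
    identify the closest predefined center (a vector with +1 at the
    m largest positions, -1 at the k smallest, and 0 elsewhere),
    replacing a full scan over all class prototypes.
    """
    # argpartition does a partial selection, cheaper than a full sort
    pos_idx = frozenset(np.argpartition(embedding, -m)[-m:].tolist())
    neg_idx = frozenset(np.argpartition(embedding, k)[:k].tolist())
    return pos_idx, neg_idx

# Toy usage: a 6-dimensional embedding with m=2, k=1
emb = np.array([0.9, -0.7, 0.1, 0.8, 0.05, -0.1])
pos, neg = predict_center_indices(emb, m=2, k=1)
# pos == {0, 3} (two largest entries), neg == {1} (smallest entry)
```

The pair of index sets can serve directly as a dictionary key for the class label, which is where the claimed independence from the number of classes comes from.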


Paper Structure

This paper contains 13 sections, 2 theorems, 18 equations, 2 figures, 3 tables, and 2 algorithms.

Key Result

Theorem 1

Let $V^{mk}_n$ be the set of all vectors in $\mathbb{R}^n$ having exactly $m$ entries equal to $1$, $k$ entries equal to $-1$, and $n-m-k \ge 0$ zero entries. Let $W^{mk}_n$ be the set of all vectors $x \in \mathbb{R}^n$ such that, after rearranging the coordinates in non-decreasing order $x_{(1)} \le \dots \le x_{(n)}$, one has $x_{(k)} < x_{(k+1)}$ and $x_{(n-m)} < x_{(n-m+1)}$, so that the $k$ smallest and the $m$ largest coordinates are uniquely determined. Define a mapping $f : W^{mk}_n \to V^{mk}_n$ by assigning to each $w$ the vector $f(w)$ obtained by replacing the $m$ largest entries of $w$ with $1$, the $k$ smallest entries with $-1$, and the remaining entries with $0$. Then $f(w)$ is the element of $V^{mk}_n$ closest to $w$.
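The mapping $f$ can be sketched in a few lines (an illustrative construction, not the authors' implementation; the function name `f_map` and the toy vector are assumptions):

```python
import numpy as np

def f_map(w, m, k):
    """Map w in W^{mk}_n to its center in V^{mk}_n.

    The m largest coordinates of w become +1, the k smallest
    become -1, and the remaining n-m-k coordinates become 0.
    """
    v = np.zeros_like(w, dtype=float)
    v[np.argpartition(w, -m)[-m:]] = 1.0   # m largest entries -> +1
    v[np.argpartition(w, k)[:k]] = -1.0    # k smallest entries -> -1
    return v

w = np.array([0.2, -0.9, 1.3, 0.0, 0.7])
v = f_map(w, m=2, k=1)  # v == [0., -1., 1., 0., 1.]
```

Note that the strict inequalities in the definition of $W^{mk}_n$ are what make the selection of the $m$ largest and $k$ smallest positions unambiguous.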

Figures (2)

  • Figure 1: (a) The total time required to obtain labels for 1M images as a function of nclasses with cossim and the proposed method (the dashed line indicates where the batch size becomes too small for cossim to be applicable), and (b) the total acceleration coefficient Kt as a function of nclasses (see Table \ref{tab:321}).
  • Figure 2: A "toy" example of the dog-only detection system receiving cat queries: red arrows represent labeled centers, black arrow represents the unlabeled center, red digits represent data labels, black roman numerals represent center indexes, dog images are example classes 1-3, cat images correspond to example queries q1-q3. Blue lines correspond to decision boundaries, or angles where closest centers change.

Theorems & Definitions (4)

  • Theorem 1
  • Proof of Theorem 1
  • Lemma 1
  • Proof of Lemma 1