Prototype-based interpretation of the functionality of neurons in winner-take-all neural networks

Ramin Zarei Sabzevar, Kamaledin Ghiasi-Shirazi, Ahad Harati

TL;DR

The main finding of this article is an interpretation of each neuron's functionality as computing the difference between its distances to a positive and a negative prototype, an interpretation that agrees with the BCM theory.

Abstract

Prototype-based learning (PbL) using a winner-take-all (WTA) network based on minimum Euclidean distance (ED-WTA) is an intuitive approach to multiclass classification. By constructing meaningful class centers, PbL provides higher interpretability and generalization than hyperplane-based learning (HbL) methods based on maximum inner product (IP-WTA), and it can efficiently detect and reject samples that do not belong to any class. In this paper, we first prove the equivalence of IP-WTA and ED-WTA from a representational point of view. Then, we show that naively using this equivalence leads to unintuitive ED-WTA networks in which the centers are far from the data that they represent. We propose $\pm$ED-WTA, which models each neuron with two prototypes: a positive prototype representing the samples that are modeled by this neuron, and a negative prototype representing the samples that are erroneously won by that neuron during training. We propose a novel training algorithm for the $\pm$ED-WTA network that switches between updating the positive and negative prototypes; this switching is essential to the emergence of interpretable prototypes. Unexpectedly, we observed that the negative prototype of each neuron is nearly indistinguishable from the positive one. The rationale behind this observation is that the training data that are mistaken for a prototype are indeed similar to it. The main finding of this paper is an interpretation of the functionality of neurons as computing the difference between the distances to a positive and a negative prototype, which is in agreement with the BCM theory. In our experiments, we show that the proposed $\pm$ED-WTA method constructs highly interpretable prototypes that can be successfully used for detecting outliers and adversarial examples.
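
The link between the two-prototype view and an ordinary IP-WTA neuron can be checked numerically. The sketch below is not the authors' code: it assumes squared Euclidean distances and the sign convention "distance to the negative prototype minus distance to the positive prototype", and the names `pm_ed_scores` and `to_ip_wta` are hypothetical. Under those assumptions, $\|x-n\|^2 - \|x-p\|^2 = 2(p-n)^\top x + \|n\|^2 - \|p\|^2$, so each neuron reduces to an inner product plus a bias.

```python
import numpy as np

def pm_ed_scores(x, pos, neg):
    """Per-neuron score: squared distance to the negative prototype minus
    squared distance to the positive prototype (larger means a better match).
    The squared distance and the sign convention are assumptions."""
    d_neg = np.sum((x - neg) ** 2, axis=1)
    d_pos = np.sum((x - pos) ** 2, axis=1)
    return d_neg - d_pos

def to_ip_wta(pos, neg):
    """Fold the two prototypes of each neuron into one weight vector and bias."""
    w = 2.0 * (pos - neg)
    b = np.sum(neg ** 2, axis=1) - np.sum(pos ** 2, axis=1)
    return w, b

# Random stand-ins for prototypes: one row per neuron, 5 input dimensions.
rng = np.random.default_rng(0)
pos = rng.normal(size=(3, 5))
neg = rng.normal(size=(3, 5))
x = rng.normal(size=5)

w, b = to_ip_wta(pos, neg)
ed_scores = pm_ed_scores(x, pos, neg)
ip_scores = w @ x + b

# Both forms give the same scores, hence the same winning neuron.
assert np.allclose(ed_scores, ip_scores)
winner = int(np.argmax(ip_scores))
```

Because the two score vectors coincide for every input, the winner chosen by the distance-difference form and by the inner-product form is always the same under these assumptions.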

Paper Structure

This paper contains 17 sections, 32 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: The structure of a $\pm$ED-WTA network with positive and negative prototypes for each class. Each neuron is modeled by a positive and a negative prototype center. During training, we ensure that these centers are kept close, in Euclidean norm, to the input data that they represent. In final usage, the whole functionality of the positive and negative centers is simplified to an ordinary IP-WTA network.
  • Figure 2: Visualization of (a) trained IP-WTA weights, (b) the centers obtained by converting the weights of IP-WTA to ED-WTA using the algorithm of Section \ref{sec:iterativeAlgorithm}, (c) the centers after dropping the biases of ED-WTA and continuing training with CCE for 200 epochs, (d) the centers obtained by the K-means algorithm. Positive and negative values are shown in green and red, respectively.
  • Figure 5: Confidence measures of some outlier samples from the ORL dataset, computed by a $\pm$ED-WTA model trained on the MNIST dataset. The confidence measures $P^{IP}$ and $P^{+ED}$ are drawn below each sample from left to right, respectively.
  • Figure 6: Acceptance rate on the MNIST test set and the rejection rate on the ORL dataset for different threshold values of $P^{+ED}$.
  • Figure 7: The process of generating adversarial examples. (a) Some samples from the MNIST test set along with a pure-noise image. (b) The positive centers of the neuron in $\pm$ED-WTA chosen as the target for generating an adversarial example. (c) The resulting adversarial examples. In (a) and (c), the probabilities $P^{IP}$ and $P^{+ED}$ are drawn from left to right below each sample. While $P^{IP}$ is high for both the original and the adversarial examples, $P^{+ED}$ is high only for the original digit images. The values of $P^{IP}$ and $P^{+ED}$ for the pure-noise image in (a) are particularly instructive. Since the pure-noise image is far from all hyperplanes, $P^{IP}$ confidently accepts it as a digit. On the other hand, since the pure-noise image is far from all positive centers of $\pm$ED-WTA, $P^{+ED}$ confidently rejects it.
  • ...and 1 more figure
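
The contrast drawn in the captions of Figures 6 and 7 (a sample far from all hyperplanes is accepted with high confidence, while a sample far from all positive centers is rejected) can be illustrated with a toy sketch. The captions do not define $P^{IP}$ or $P^{+ED}$, so both functions below are assumed stand-ins rather than the paper's measures: `ip_confidence` is the largest softmax probability over inner-product scores, and `ed_confidence` is `exp(-d^2 / scale)` for the nearest positive center. The names, the `scale` parameter, and the threshold value are all hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def ip_confidence(x, w, b):
    """Assumed stand-in for P^IP: largest softmax probability of the IP scores."""
    return float(np.max(softmax(w @ x + b)))

def ed_confidence(x, pos, scale=1.0):
    """Assumed stand-in for P^{+ED}: decays with the squared distance to the
    nearest positive center."""
    d2 = np.min(np.sum((x - pos) ** 2, axis=1))
    return float(np.exp(-d2 / scale))

def accept(x, pos, threshold=0.5, scale=1.0):
    """Accept a sample only if its distance-based confidence reaches the threshold."""
    return ed_confidence(x, pos, scale) >= threshold

# Two toy positive centers, and the IP-WTA form of nearest-center classification.
pos = np.array([[1.0, 0.0], [0.0, 1.0]])
w, b = 2.0 * pos, -np.sum(pos ** 2, axis=1)

inlier = np.array([0.9, 0.1])    # close to the first center
outlier = np.array([50.0, 0.0])  # far from both centers and from the decision boundary

# The inner-product confidence stays high for the outlier, while the
# distance-based confidence drops and the sample is rejected.
print(ip_confidence(outlier, w, b), ed_confidence(outlier, pos))
print(accept(inlier, pos), accept(outlier, pos))
```

Sweeping `threshold` over a range and recording the fraction of accepted in-distribution samples and rejected out-of-distribution samples would give curves of the kind described for Figure 6, again only under the assumed confidence definition.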