Kernel Memory Networks: A Unifying Framework for Memory Modeling

Georgios Iatropoulos, Johanni Brea, Wulfram Gerstner

TL;DR

The framework of kernel memory networks offers a simple and intuitive way to understand the storage capacity of previous memory models, and allows for new biological interpretations in terms of dendritic non-linearities and synaptic cross-talk.

Abstract

We consider the problem of training a neural network to store a set of patterns with maximal noise robustness. A solution, in terms of optimal weights and state update rules, is derived by training each individual neuron to perform either kernel classification or interpolation with a minimum weight norm. By applying this method to feed-forward and recurrent networks, we derive optimal models, termed kernel memory networks, that include, as special cases, many of the hetero- and auto-associative memory models that have been proposed over the past years, such as modern Hopfield networks and Kanerva's sparse distributed memory. We modify Kanerva's model and demonstrate a simple way to design a kernel memory network that can store an exponential number of continuous-valued patterns with a finite basin of attraction. The framework of kernel memory networks offers a simple and intuitive way to understand the storage capacity of previous memory models, and allows for new biological interpretations in terms of dendritic non-linearities and synaptic cross-talk.
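
As an illustration of the idea that a memory can be built by fitting each neuron with a minimum-norm kernel interpolant, the following is a minimal sketch of an auto-associative memory in NumPy. It is not the authors' implementation: the exponential dot-product kernel, the value of `beta`, the renormalization of the state to the unit sphere, and all function names are assumptions chosen for illustration. The only property relied on is the standard one that the coefficients solving `K @ alpha = X` give the minimum-norm interpolant in the kernel's feature space, so every stored pattern is reproduced exactly.

```python
import numpy as np

def exp_kernel(A, B, beta=4.0):
    """Exponential dot-product kernel K(a, b) = exp(beta * a.b) for rows of A and B.

    The kernel choice and beta are assumptions made for this sketch.
    """
    return np.exp(beta * (A @ B.T))

def fit_kernel_memory(X, beta=4.0):
    """Dual coefficients of the minimum-norm kernel interpolant mapping each pattern to itself."""
    K = exp_kernel(X, X, beta)        # (M, M) Gram matrix of the stored patterns
    return np.linalg.solve(K, X)      # (M, N) alpha satisfying K @ alpha = X

def recall(x, X, alpha, beta=4.0, steps=5):
    """Iterate the interpolant as a recurrent state update.

    Renormalizing to the unit sphere is an illustrative choice, not taken from the paper.
    """
    for _ in range(steps):
        k = exp_kernel(x[None, :], X, beta)   # (1, M) similarities to stored patterns
        x = (k @ alpha)[0]                    # each output unit is one kernel interpolant
        x = x / np.linalg.norm(x)
    return x

# Store M random unit-norm patterns of dimension N (hypothetical sizes).
rng = np.random.default_rng(0)
M, N = 5, 16
X = rng.standard_normal((M, N))
X /= np.linalg.norm(X, axis=1, keepdims=True)
alpha = fit_kernel_memory(X)

# Query with a noisy version of the first pattern and report its overlap after recall.
noisy = X[0] + 0.2 * rng.standard_normal(N)
noisy /= np.linalg.norm(noisy)
recalled = recall(noisy, X, alpha)
print("overlap with stored pattern:", float(recalled @ X[0]))
```

By construction, each stored pattern is a fixed point of `recall`. How large a perturbation the update tolerates depends on the kernel and on the storage load, which is the question the paper's framework addresses.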

Paper Structure (39 sections, 64 equations, 4 figures)

Figures (4)

  • Figure 1: Graphical representation of (A1-A2) the feed-forward SVM network, (A3) the SDM, (B1-B2) the recurrent SVM network, and (C) an SVM mapped to the anatomy of a pyramidal cell (see Sec. \ref{sec:neurons}). Circles represent neurons, while boxes represent the input transformation by the feature map $\bm{\phi}$, which can be dependent (A1, B1) or independent (A2, B2) of neuron index $i$.
  • Figure 2: Plot of random patterns on $\mathbb{S}^2$ together with attractor basins at $\beta \to \infty$. Dots represent patterns ($M=17$) while thick and thin red lines correspond to the boundaries of the attractor basins according to the Exp$_\beta$ network and the softmax network, respectively. The radius of the circular boundaries has been set to half the minimum pairwise distance between the patterns.
  • Figure B.1: Plot of the kernel of an infinite SDM on the hypersphere, $K_\mathrm{SDM}(\mathbf{x}_i, \mathbf{x}_j)$, as a function of (A) the angle between $\mathbf{x}_i$ and $\mathbf{x}_j$, and (B) the bias $b$. Solid lines represent the exact solution in Eq. \ref{eq:sdm_kernel_sphere_apx}, and dashed lines the approximation in Eq. \ref{eq:sdm_kernel_sphere_approx_apx}. Parameter values: (A) $N_\mathrm{in} = 50$; (B) $\arccos(\mathbf{x}_i^\top \mathbf{x}_j) = \arccos(b)$.
  • Figure C.1: Plot of the noise tolerance $\gamma$ (mean $\pm$ s.e.m. over 20 simulations) as a function of the storage load for a single SVM neuron with $N=10^2$ inputs, trained with the stochastic batch perceptron (SBP) and the modern Hopfield rule. The SBP uses the feature map $\bm{\phi}_\mathrm{pairs}$, while the modern Hopfield rule is applied to both $\bm{\phi}_\mathrm{pairs}$ and $\bm{\phi}_\mathrm{poly2}$, corresponding to the kernel $K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i^\top \mathbf{x}_j)^2$. SBP hyperparameters: learning rate $= 10^{-5}$, iterations $= 20 M$.

Theorems & Definitions (4)

  • Remark
  • Remark
  • Remark
  • Remark