Hierarchical Associative Memory

Dmitry Krotov

TL;DR

This work generalizes Modern Hopfield Networks to fully recurrent, multi-layer architectures with local connectivity by leveraging a Lagrangian formalism that yields a global Lyapunov energy $E$ decreasing along trajectories. It introduces hierarchical layered HAMs with symmetric feedforward/feedback weights, enabling bottom-up and top-down information flow and ensuring convergence to fixed-point attractors. Three simple architectures are worked out (one hidden layer, two dense hidden layers, and two hidden layers with a convolutional first layer), each accompanied by explicit dynamics and energy expressions, including adiabatic time-scale simplifications. The approach broadens memory capacity and inductive bias while maintaining biological plausibility, offering pathways for end-to-end or time-unfolded training and highlighting potential extensions like lateral connections and gated units.

Abstract

Dense Associative Memories or Modern Hopfield Networks have many appealing properties of associative memory. They can do pattern completion, store a large number of memories, and can be described using a recurrent neural network with a degree of biological plausibility and rich feedback between the neurons. At the same time, up until now all the models of this class have had only one hidden layer, and have only been formulated with densely connected network architectures, two aspects that hinder their machine learning applications. This paper addresses this gap and describes a fully recurrent model of associative memory with an arbitrarily large number of layers, some of which can be locally connected (convolutional), and a corresponding energy function that decreases on the dynamical trajectory of the neurons' activations. The memories of the full network are dynamically "assembled" using primitives encoded in the synaptic weights of the lower layers, with the "assembling rules" encoded in the synaptic weights of the higher layers. In addition to the bottom-up propagation of information, typical of commonly used feedforward neural networks, the model described has rich top-down feedback from higher layers that helps the lower-layer neurons to decide on their response to the input stimuli.
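
To make the phrase "a corresponding energy function that decreases on the dynamical trajectory" concrete, the layered dynamics and energy can be sketched as follows. The notation is an assumption of this sketch, not something stated in the summary: $x_i^A$ is the current of neuron $i$ in layer $A$, $L^A$ is that layer's Lagrangian, $g_i^A$ is its activation, $\tau_A$ is its time constant, and $\xi^{(A+1,A)}$ are the weights between layers $A$ and $A+1$, shared by the feedforward and feedback directions.

```latex
% Activations are derivatives of the layer Lagrangians (assumed notation)
g_i^A = \frac{\partial L^A}{\partial x_i^A}

% Each layer receives bottom-up input from layer A-1 and top-down input from layer A+1
\tau_A \frac{d x_i^A}{dt}
  = \sum_j \xi^{(A,A-1)}_{ij}\, g_j^{A-1}
  + \sum_j \xi^{(A,A+1)}_{ij}\, g_j^{A+1}
  - x_i^A,
\qquad \xi^{(A,A+1)} = \big(\xi^{(A+1,A)}\big)^{\top}

% Global energy: per-layer terms minus the interaction between adjacent layers
E = \sum_{A} \Big[ \sum_i x_i^A g_i^A - L^A \Big]
  - \sum_{A} \sum_{i,j} g_i^{A+1}\, \xi^{(A+1,A)}_{ij}\, g_j^{A}
```

Under these assumptions, $dE/dt \le 0$ along trajectories whenever the Hessian of every $L^A$ is positive semi-definite. The weight symmetry in the second line is what allows one scalar energy to govern both the bottom-up and the top-down flow.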

Paper Structure

This paper contains 9 sections, 25 equations, 2 figures.

Figures (2)

  • Figure 1: (A) The connectivity diagram of the fully-connected network. The synaptic weights are described by the symmetric matrix $W_{IJ}$. (B) The connectivity diagram of the layered network. Each layer can have a different number of neurons, a different activation function, and a different time scale. The feedforward weights and feedback weights are equal.
  • Figure 2: (A) Modern Hopfield Network with one hidden layer. The output of the input layer is a linear function, the output of the hidden layer is a softmax with inverse temperature $\beta$. (B) HAM network with two hidden layers. The output of the input layer is a linear function, the output of the second layer is a softmax with inverse temperature $\beta_2$, the output of the third layer is a softmax with inverse temperature $\beta_3$. (C) Convolutional HAM model with two hidden layers (convolutional and dense).
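
The one-hidden-layer model of Figure 2(A) can be illustrated with a minimal sketch. It assumes the hidden layer is fast enough to be replaced by its softmax steady state (the adiabatic simplification mentioned in the TL;DR), and it uses a discrete-time update in place of the paper's continuous dynamics. The function names, the toy memories, and the value of `beta` are hypothetical choices made only for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores, beta):
    # Numerically stable softmax with inverse temperature beta.
    m = max(scores)
    exps = [math.exp(beta * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def energy(v, memories, beta):
    # Assumed energy: E(v) = 1/2 |v|^2 - (1/beta) * log sum_mu exp(beta * <xi_mu, v>)
    scores = [dot(xi, v) for xi in memories]
    m = max(scores)
    lse = m + math.log(sum(math.exp(beta * (s - m)) for s in scores)) / beta
    return 0.5 * dot(v, v) - lse

def update(v, memories, beta):
    # Bottom-up: hidden softmax over similarities; top-down: same weights, transposed.
    f = softmax([dot(xi, v) for xi in memories], beta)
    return [sum(f[mu] * memories[mu][i] for mu in range(len(memories)))
            for i in range(len(v))]

# Toy stored patterns (hypothetical) and a query equal to memory 0 with one flipped bit.
memories = [[1, 1, -1, -1], [1, -1, 1, -1], [-1, -1, -1, 1]]
v = [1, 1, -1, 1]
beta = 2.0

energies = [energy(v, memories, beta)]
for _ in range(10):
    v = update(v, memories, beta)
    energies.append(energy(v, memories, beta))

recalled = [1 if x > 0 else -1 for x in v]
print("recalled signs:", recalled)
print("energy non-increasing:", all(b <= a + 1e-12 for a, b in zip(energies, energies[1:])))
```

The same weight matrix is used in both directions here, which mirrors the equal feedforward and feedback weights shown in Figure 1(B). The two-hidden-layer variants in Figure 2(B) and 2(C) would add a second softmax layer on top, and are not covered by this sketch.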