Capacity of the Hebbian-Hopfield network associative memory

Mihailo Stojnic

TL;DR

The paper analyzes the associative memory capacity of the Hebbian Hopfield network for random binary patterns, focusing on two basins of attraction (AGS and NLT) and employing the fully lifted random duality theory (fl RDT). It derives explicit first-level capacity formulas $\alpha_c^{(AGS,1)}$ and $\alpha_c^{(NLT,1)}$, with numerical values $\approx 0.137906$ and $\approx 0.129490$, respectively. Second-level lifting changes these only slightly, to $\alpha_c^{(AGS,2)}\approx0.138186$ and $\alpha_c^{(NLT,2)}\approx0.12979$, which indicates remarkably fast convergence of the lifting mechanism. The AGS results match the replica-symmetry analysis (AmiGutSom85) and the corresponding symmetry-breaking results (SteKuh94), while the NLT outcomes surpass prior rigorous bounds (Newman88, Louk94, Tal98). The methodology, anchored in bilinearly indexed random processes and sfl RDT, provides a generic framework for analyzing memory capacity and can be extended to broader network architectures and basins.

Abstract

In \cite{Hop82}, Hopfield introduced a neural network model based on the \emph{Hebbian} learning rule and suggested how it can efficiently operate as an associative memory. Studying random binary patterns, he also uncovered that, if a small fraction of errors is tolerated in the retrieval of the stored patterns, the capacity of the network (the maximal number of memorized patterns, $m$) scales linearly with each pattern's size, $n$. Moreover, he famously predicted $\alpha_c=\lim_{n\rightarrow\infty}\frac{m}{n}\approx 0.14$. We study this very same scenario with two well-known patterns' basins of attraction: \textbf{\emph{(i)}} The AGS one from \cite{AmiGutSom85}; and \textbf{\emph{(ii)}} The NLT one from \cite{Newman88,Louk94,Louk94a,Louk97,Tal98}. Relying on the \emph{fully lifted random duality theory} (fl RDT) from \cite{Stojnicflrdt23}, we obtain the following explicit capacity characterizations on the first level of lifting: \begin{equation} \alpha_c^{(AGS,1)} = \left ( \max_{\delta\in \left ( 0,\frac{1}{2}\right ) }\frac{1-2\delta}{\sqrt{2} \mbox{erfinv} \left ( 1-2\delta\right )} - \frac{2}{\sqrt{2\pi}} e^{-\left ( \mbox{erfinv}\left ( 1-2\delta\right )\right )^2}\right )^2 \approx \mathbf{0.137906} \end{equation} \begin{equation} \alpha_c^{(NLT,1)} = \frac{\mbox{erf}(x)^2}{2x^2}-1+\mbox{erf}(x)^2 \approx \mathbf{0.129490}, \quad 1-\mbox{erf}(x)^2- \frac{2\mbox{erf}(x)e^{-x^2}}{\sqrt{\pi}x}+\frac{2e^{-2x^2}}{\pi}=0. \end{equation} Substantial numerical work gives, on the second level of lifting, $\alpha_c^{(AGS,2)} \approx \mathbf{0.138186}$ and $\alpha_c^{(NLT,2)} \approx \mathbf{0.12979}$, effectively uncovering a remarkably fast lifting convergence. Moreover, the obtained AGS characterizations exactly match the replica symmetry based ones of \cite{AmiGutSom85} and the corresponding symmetry breaking ones of \cite{SteKuh94}.
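The two first-level formulas in the abstract are explicit enough to be checked numerically. The following is a minimal sketch, not the paper's own procedure, that uses only the Python standard library. Since the standard library has no inverse error function, the AGS expression is reparametrized by the assumed substitution $x=\mbox{erfinv}(1-2\delta)$, so that $1-2\delta=\mbox{erf}(x)$ and $\delta\in\left(0,\frac{1}{2}\right)$ corresponds to $x\in(0,\infty)$. The helper names (`ags_bracket`, `nlt_condition`, `nlt_alpha`, `golden_max`, `bisect_root`) and the search intervals are illustrative choices, picked by inspecting the functions, and do not come from the paper.

```python
import math


def ags_bracket(x: float) -> float:
    """Bracketed AGS expression, rewritten with x = erfinv(1 - 2*delta)."""
    # Under this substitution (1 - 2*delta) becomes erf(x).
    return (math.erf(x) / (math.sqrt(2.0) * x)
            - 2.0 / math.sqrt(2.0 * math.pi) * math.exp(-x * x))


def nlt_condition(x: float) -> float:
    """Left-hand side of the NLT side condition; its root fixes x."""
    e = math.erf(x)
    return (1.0 - e * e
            - 2.0 * e * math.exp(-x * x) / (math.sqrt(math.pi) * x)
            + 2.0 * math.exp(-2.0 * x * x) / math.pi)


def nlt_alpha(x: float) -> float:
    """First-level NLT capacity expression evaluated at x."""
    e = math.erf(x)
    return e * e / (2.0 * x * x) - 1.0 + e * e


def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search; assumes f has a single maximum on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return 0.5 * (a + b)


def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# AGS: maximize the bracket over x, then square it.
x_ags = golden_max(ags_bracket, 0.5, 3.0)
alpha_ags_1 = ags_bracket(x_ags) ** 2
delta_ags = 0.5 * (1.0 - math.erf(x_ags))  # map the maximizer back to delta

# NLT: solve the side condition for x, then evaluate the capacity expression.
x_nlt = bisect_root(nlt_condition, 1.0, 1.5)
alpha_nlt_1 = nlt_alpha(x_nlt)

print(f"AGS: x = {x_ags:.6f}, delta = {delta_ags:.6f}, alpha = {alpha_ags_1:.6f}")
print(f"NLT: x = {x_nlt:.6f}, alpha = {alpha_nlt_1:.6f}")
```

The printed values can be compared with the quoted $\alpha_c^{(AGS,1)}\approx 0.137906$ and $\alpha_c^{(NLT,1)}\approx 0.129490$. The second-level values are not reproduced here, since the abstract gives no closed form for them.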

Paper Structure (20 sections, 4 theorems, 137 equations, 6 figures, 4 tables)

Introduction
Relevant prior work
Our contributions
Mathematical setup
AGS basin of attraction
NLT basin of attraction
Statistical associative memory capacity
Free energy correspondence
Connection to bli random processes and sfl RDT
Practical realization
Numerical evaluations -- AGS basin
$r=1$ -- first level of lifting
$r=2$ -- second level of lifting
Modulo-${\bf m}$ sfl RDT
Numerical evaluations -- NLT basin
...and 5 more sections

Key Result

Theorem 1

(\cite{Stojnicflrdt23}) Consider the large $n$ linear regime with $\alpha\triangleq \lim_{n\rightarrow\infty} \frac{m}{n}$, remaining constant as $n$ grows. Let ${\mathcal{X}}\subseteq {\mathbb R}^n$ and ${\mathcal{Y}}\subseteq {\mathbb R}^m$ be two given sets and let the elements of $G\in{\mathbb R}^{m\times n}$ [...] Let $\hat{{\bf p}_0}\rightarrow 1$, $\hat{{\bf q}_0}\rightarrow 1$, and $\hat{{\bf c}_0}\rightarrow$ [...]

Figures (6)

Figure 1: $\xi_{tot}$ as a function of $\delta$; $\alpha_c^{(AGS,1)} \approx \textcolor{blue}{\mathbf{0.137905566}}$ -- maximum $\alpha$ such that the inflection point still exists on the first level of lifting
Figure 2: Alternative view of $\alpha_c^{(AGS,1)} \approx \textcolor{blue}{\mathbf{0.137905566}}$ -- maximum $\alpha$ such that the inflection point still exists on the first level of lifting
Figure 3: $\xi_{tot}$ as a function of $\delta$; $\alpha_c^{(AGS,2)} \approx \textcolor{blue}{\mathbf{0.138186}}$ -- maximum $\alpha$ such that the inflection point still exists on the second level of lifting
Figure 4: $\xi_{tot}$ as a function of $\delta$; $\alpha_c^{(NLT,1)} \approx \textcolor{blue}{\mathbf{0.1294899}}$ -- maximum $\alpha$ such that $\exists\delta\in \left ( 0,\frac{1}{2}\right )$ for which $\xi_{tot}(\delta)=\xi_{tot}(0)$ on the first level of lifting
Figure 5: $\xi_{tot}$ as a function of $\delta$; $\alpha_c^{(NLT,2)} \approx \textcolor{blue}{\mathbf{0.12979}}$ -- maximum $\alpha$ such that $\exists\delta\in \left ( 0,\frac{1}{2}\right )$ for which $\xi_{tot}(\delta)=\xi_{tot}(0)$ on the second level of lifting
...and 1 more figure

Theorems & Definitions (8)

Theorem 1
proof
Corollary 1
proof
Theorem 2
proof
Theorem 3
proof
