The Exponential Capacity of Dense Associative Memories
Carlo Lucibello, Marc Mézard
TL;DR
The paper investigates Dense Associative Memories (DAMs) with exponential storage, $P=e^{\alpha N}$, and derives exact asymptotic retrieval criteria using a Random Energy Model (REM) framework. It defines the typical-pattern retrieval threshold $\alpha_1(\lambda)$ and a lower bound $\alpha_c(\lambda)$ on the load below which all patterns can be retrieved, showing that for spherical patterns these thresholds coincide, while Gaussian patterns exhibit a gap and a condensation transition at $\lambda_*(\alpha,\rho)$. By analyzing basins of attraction, it characterizes how random initial conditions converge to retrieved patterns and highlights geometric distinctions between pattern ensembles. In a scaled dot-product regime relevant to Transformer attention, the single-pattern and all-patterns thresholds merge, which emphasizes the connection between DAMs and attention mechanisms. The results illuminate how exponential memory capacity emerges in high-dimensional settings and point to future directions: rigorous proofs and extensions to finite temperature and other pattern distributions.
Abstract
Recent generalizations of the Hopfield model of associative memories are able to store a number $P$ of random patterns that grows exponentially with the number $N$ of neurons, $P=\exp(\alpha N)$. Besides the huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism which is part of the Transformer architectures widely applied in deep learning. In this work, we study a generic family of pattern ensembles using a statistical mechanics analysis which gives exact asymptotic thresholds for the retrieval of a typical pattern, $\alpha_1$, and lower bounds for the maximum of the load $\alpha$ for which all patterns can be retrieved, $\alpha_c$, as well as sizes of attraction basins. We discuss in detail the cases of Gaussian and spherical patterns, and show that they display rich and qualitatively different phase diagrams.
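
To make the setting concrete, the following is a minimal sketch of retrieval in a small dense associative memory with $P=\exp(\alpha N)$ spherical patterns. It assumes a softmax-weighted, attention-like update with inverse temperature $\lambda$, which is a common choice for such models but is not spelled out in this section; the function names (`sample_spherical`, `softmax_update`, `sq_distance`) and the values of `N`, `alpha`, `lam` and the noise level are hypothetical and chosen only for illustration. At such a small $N$ the output says nothing about the asymptotic thresholds $\alpha_1$ and $\alpha_c$.

```python
import math
import random

def sample_spherical(n, rng):
    """Draw a pattern uniformly on the sphere of squared norm n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(n) / math.sqrt(sum(v * v for v in g))
    return [v * scale for v in g]

def softmax_update(x, patterns, lam):
    """One attention-like step: softmax(lam * <xi_mu, x>)-weighted average of patterns.

    This update rule is an assumption of the sketch, not taken from the text above.
    """
    scores = [lam * sum(a * b for a, b in zip(p, x)) for p in patterns]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, patterns)) / total
        for i in range(len(x))
    ]

def sq_distance(a, b):
    """Squared Euclidean distance per coordinate."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

# Illustrative (hypothetical) parameters.
N = 40        # number of neurons
alpha = 0.1   # load, so that P = exp(alpha * N)
lam = 1.0     # inverse temperature of the softmax
rng = random.Random(0)

P = int(math.exp(alpha * N))
patterns = [sample_spherical(N, rng) for _ in range(P)]

# Start from a noisy copy of the first pattern and apply one update.
x0 = [v + 0.5 * rng.gauss(0.0, 1.0) for v in patterns[0]]
x1 = softmax_update(x0, patterns, lam)

dist_before = sq_distance(x0, patterns[0])
dist_after = sq_distance(x1, patterns[0])
```

Comparing `dist_after` with `dist_before` shows whether the update pulled the noisy state back toward the stored pattern. Increasing `alpha` or the noise level in this sketch gives a rough, finite-size feel for how retrieval degrades as the load grows or as the initial condition moves away from the pattern.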
