Dense Associative Memory with Epanechnikov Energy
Benjamin Hoover, Zhaoyang Shi, Krishnakumar Balasubramanian, Dmitry Krotov, Parikshit Ram
TL;DR
This work addresses the trade-off between memorization and generalization in Dense Associative Memories (DenseAMs) by introducing an energy function inspired by kernel density estimation (KDE) and based on the Epanechnikov kernel, dubbed log-sum-ReLU (LSR). The LSR energy $E^{\text{LSR}}_\beta(\mathbf{x};\boldsymbol{\Xi})=-\frac{1}{\beta}\log\big(\epsilon+\sum_{\mu=1}^M \operatorname{ReLU}(1-\tfrac{\beta}{2}\|\mathbf{x}-\boldsymbol{\xi}_\mu\|^2)\big)$ enables exact retrieval of exponentially many memories and, crucially, the emergence of many novel local minima (emergent memories) without sacrificing recall. The paper provides theoretical guarantees (on retrieval and on the number of emergent memories) and validates the approach on synthetic energy landscapes and on latent spaces of real data, showing that emergent memories can be meaningful and diverse, with log-likelihood comparable to LSE-based methods. These results point to a new class of memory-rich, generative DenseAMs with potential for large-scale storage and latent-space generation. The paper also outlines practical limitations and future directions, such as hybrid energy formulations and exploration of other kernel families.
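The LSR energy above can be evaluated directly from its formula. The following is a minimal NumPy sketch, not the authors' implementation: the function name `lsr_energy`, the default `eps=1e-6`, and the toy memories are assumptions made for illustration.

```python
import numpy as np

def lsr_energy(x, memories, beta, eps=1e-6):
    """Evaluate -(1/beta) * log(eps + sum_mu ReLU(1 - beta/2 * ||x - xi_mu||^2)).

    x        : query point, shape (d,)
    memories : stored patterns xi_mu, shape (M, d)
    beta     : inverse temperature; each kernel vanishes beyond radius sqrt(2 / beta)
    eps      : small positive constant keeping the log finite (value is an assumption)
    """
    sq_dists = np.sum((memories - x) ** 2, axis=1)          # ||x - xi_mu||^2 for every mu
    kernel = np.maximum(0.0, 1.0 - 0.5 * beta * sq_dists)   # ReLU of the Epanechnikov term
    return -np.log(eps + np.sum(kernel)) / beta

# Toy example (hypothetical data): two memories, kernel radius sqrt(2 / 2) = 1.
memories = np.array([[0.0, 0.0], [3.0, 0.0]])
beta = 2.0

e_at_memory = lsr_energy(memories[0], memories, beta)        # exactly one kernel is active
e_far = lsr_energy(np.array([10.0, 10.0]), memories, beta)   # no kernel is active
```

Because the ReLU truncates each kernel at a finite radius, a query far from every memory sees only the `eps` term, so the energy is flat there and equal to `-log(eps) / beta`.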
Abstract
We propose a novel energy function for Dense Associative Memory (DenseAM) networks, the log-sum-ReLU (LSR), inspired by optimal kernel density estimation. Unlike the common log-sum-exponential (LSE) function, LSR is based on the Epanechnikov kernel and enables exact memory retrieval with exponential capacity without requiring exponential separation functions. Moreover, it introduces abundant additional \emph{emergent} local minima while preserving perfect pattern recovery, a characteristic previously unseen in the DenseAM literature. Empirical results show that the LSR energy has significantly more local minima (memories) than LSE-based models, with comparable log-likelihood. Analysis of LSR's emergent memories on image datasets reveals a degree of creativity and novelty, hinting at this method's potential for both large-scale memory storage and generative tasks.
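To make the coexistence of exact retrieval and emergent minima concrete, here is a small gradient-descent sketch on the LSR energy. It is an illustration under assumptions, not the authors' code: the helper names `lsr_grad` and `descend`, the step size, the value of `eps`, and the two-memory configuration are all hypothetical. It relies only on the gradient of the LSR formula, $\nabla E = \sum_{\mu\,\text{active}} (\mathbf{x}-\boldsymbol{\xi}_\mu) / (\epsilon + \sum_\mu \operatorname{ReLU}(\cdot))$, which vanishes at the centroid of the memories whose kernels are active at $\mathbf{x}$.

```python
import numpy as np

def lsr_grad(x, memories, beta, eps=1e-6):
    """Gradient of the LSR energy with respect to x (zero where no kernel is active)."""
    diffs = x - memories                                    # rows are x - xi_mu
    sq_dists = np.sum(diffs ** 2, axis=1)
    kernel = np.maximum(0.0, 1.0 - 0.5 * beta * sq_dists)   # ReLU terms
    active = kernel > 0.0                                   # memories within radius sqrt(2 / beta)
    return diffs[active].sum(axis=0) / (eps + kernel.sum())

def descend(x0, memories, beta, lr=0.1, steps=500):
    """Plain gradient descent on the LSR energy (step size and step count are assumptions)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * lsr_grad(x, memories, beta)
    return x

# Hypothetical toy setup: two memories at distance 2, kernel radius sqrt(2) (beta = 1).
# Each memory lies outside the other's kernel, but the two kernels overlap in between.
memories = np.array([[0.0, 0.0], [2.0, 0.0]])
beta = 1.0

near_first = descend([-0.3, 0.2], memories, beta)    # only the first kernel is active
near_second = descend([2.2, -0.1], memories, beta)   # only the second kernel is active
between = descend([0.9, 0.3], memories, beta)        # both kernels are active
```

In this toy configuration the first two runs converge to the stored patterns themselves, while the third converges to their centroid `[1, 0]`, a local minimum that is not a stored pattern. This is only a two-point illustration of the behavior the abstract describes; the conditions under which it holds in general are the subject of the paper's theoretical results.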
