Modern Hopfield Networks with Continuous-Time Memories
Saul Santos, António Farinhas, Daniel C. McNamee, André F. T. Martins
TL;DR
The paper tackles scalable memory in modern Hopfield networks by introducing continuous-time memories that compress large discrete memories into smooth, continuous representations. It derives an energy-based CTM-HN whose update rule uses a Gibbs density over a continuous signal, linking attractor dynamics to neural resource allocation and continuous attention. Empirical results on synthetic tasks and video data show retrieval performance comparable to discrete HNs while using fewer memory resources, with notable gains for continuous embeddings and for long memories. The approach offers a principled, resource-efficient path toward scalable memory-augmented models and strengthens the connection between Hopfield dynamics, continuous attention, and transformer-inspired architectures.
Abstract
Recent research has established a connection between modern Hopfield networks (HNs) and transformer attention heads, with guarantees of exponential storage capacity. However, these models still face challenges in scaling storage efficiently. Inspired by psychological theories of continuous neural resource allocation in working memory, we propose an approach that compresses large discrete Hopfield memories into smaller, continuous-time memories. Leveraging continuous attention, our new energy function modifies the update rule of HNs, replacing the traditional softmax-based probability mass function with a probability density over the continuous memory. This formulation aligns with modern perspectives on human executive function, offering a principled link between attractor dynamics in working memory and resource-efficient memory allocation. Our framework maintains competitive performance with HNs while leveraging a compressed memory, reducing computational costs across synthetic and video datasets.
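To make the change in the update rule concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the continuous memory is a linear combination of Gaussian radial basis functions on [0, 1] fitted by ridge regression, and it approximates the integral under the density with a uniform grid. The function names, the basis choice, and all hyperparameters are illustrative assumptions.

```python
import numpy as np


def rbf_basis(t, centers, width):
    """Evaluate Gaussian RBFs at positions t. Returns an array of shape (len(t), len(centers))."""
    return np.exp(-0.5 * ((t[:, None] - centers[None, :]) / width) ** 2)


def fit_continuous_memory(X, num_basis=16, ridge=1e-3):
    """Compress L discrete memories X (L, D) into coefficients B (num_basis, D).

    Assumption: memory l is placed at position t_l = l / (L - 1) in [0, 1],
    and the continuous signal is x(t) = B^T psi(t) with Gaussian RBFs psi.
    """
    L, _ = X.shape
    t = np.linspace(0.0, 1.0, L)
    centers = np.linspace(0.0, 1.0, num_basis)
    width = 1.0 / num_basis  # hypothetical bandwidth choice
    F = rbf_basis(t, centers, width)  # (L, num_basis)
    # Ridge regression: B = (F^T F + ridge * I)^{-1} F^T X
    B = np.linalg.solve(F.T @ F + ridge * np.eye(num_basis), F.T @ X)
    return B, centers, width


def discrete_update(X, q, beta):
    """Standard modern Hopfield update: q <- X^T softmax(beta * X q)."""
    scores = beta * (X @ q)
    p = np.exp(scores - scores.max())
    p /= p.sum()
    return X.T @ p


def continuous_update(B, centers, width, q, beta, grid_size=512):
    """Continuous-memory update: q <- integral of p(t) x(t) dt,
    where p(t) is proportional to exp(beta * x(t)^T q).

    The integral is approximated on a uniform grid; the grid spacing
    cancels between the density's normaliser and the expectation.
    """
    t = np.linspace(0.0, 1.0, grid_size)
    x_t = rbf_basis(t, centers, width) @ B  # (grid_size, D)
    scores = beta * (x_t @ q)
    w = np.exp(scores - scores.max())
    w /= w.sum()  # discretised Gibbs density weights
    return x_t.T @ w
```

In this sketch the stored state is the coefficient matrix `B` of shape `(num_basis, D)` rather than the full `(L, D)` memory, so the memory footprint no longer grows with `L` once `num_basis` is fixed. How closely the continuous update tracks the discrete one depends on how well the chosen basis can represent the stored sequence.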
