Sequential Learning in the Dense Associative Memory
Hayden McAlister, Anthony Robins, Lech Szymanski
TL;DR
This paper investigates sequential learning in the Dense Associative Memory (DAM), a modern Hopfield network with memory vectors and an interaction vertex n. It benchmarks a range of sequential learning techniques, including naive rehearsal, pseudorehearsal, GEM, A-GEM, and several regularization-based methods, on five permuted MNIST tasks, revealing DAM-specific transitions as n varies. The findings show strong effectiveness of rehearsal-based approaches, notable instability of gradient-based methods at intermediate vertices, and nuanced performance of regularization strategies that depends on data size and memory regime. These results establish a foundation for understanding DAM behavior under sequential learning and suggest directions for extending the DAM to continuous domains and exploring its attractor dynamics under task sequences.
Abstract
Sequential learning involves learning tasks in a sequence, and proves challenging for most neural networks. Biological neural networks regularly conquer the sequential learning challenge and are even capable of transferring knowledge both forward and backward between tasks. Artificial neural networks often fail entirely to transfer performance between tasks, and regularly suffer from degraded performance or catastrophic forgetting on previous tasks. Because of their biological ties and inspirations, models of associative memory have been used to investigate this discrepancy between biological and artificial neural networks; the Hopfield network is the most studied of these models. The Dense Associative Memory (DAM), or modern Hopfield network, generalizes the Hopfield network, allowing for greater capacities and prototype learning behaviors while retaining the associative memory structure. We give a substantial review of the sequential learning space with particular respect to the Hopfield network and associative memories. We perform foundational benchmarks of sequential learning in the DAM using various sequential learning techniques, and analyze the results to demonstrate previously unseen transitions in the behavior of the DAM. This paper also discusses the departure from biological plausibility that may affect the utility of the DAM as a tool for studying biological neural networks. We present our findings, including the effectiveness of a range of state-of-the-art sequential learning methods when applied to the DAM, and use these methods to further the understanding of DAM properties and behaviors.
