Structured Memory for Neural Turing Machines
Wei Zhang, Yang Yu, Bowen Zhou
TL;DR
The paper investigates how memory organization in Neural Turing Machines affects convergence and overfitting. It proposes three structured-memory architectures (NTM1-NTM3) with hidden memory and hierarchical writing to stabilize memories; experiments on copy and associative recall tasks show NTM1/NTM2 improve convergence speed and reduce outliers relative to baseline NTM, while NTM3 is less stable. Overall, memory structuring can stabilize NTMs and improve learning of long-range sequence tasks. This work demonstrates a viable path to enhance NTMs by rethinking memory layout rather than merely increasing memory capacity.
Abstract
Neural Turing Machines (NTMs) contain a memory component that simulates "working memory" in the brain, storing and retrieving information to ease the learning of simple algorithms. So far, only linearly organized memory has been proposed, and in our experiments we observed that the model does not always converge and overfits easily on certain tasks. We believe the memory component is key to some of these faulty behaviors, and that better organization of the memory could help address them. In this paper, we propose several different memory structures for NTMs, and we show experimentally that two of our proposed structured-memory NTMs lead to better convergence, in terms of speed and prediction accuracy, on the copy task and the associative recall task of (Graves et al. 2014).
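For readers unfamiliar with the copy task mentioned in the abstract, the following is a minimal sketch of how one training example might be generated. It is not taken from the paper. The function name `make_copy_example`, the vector width of 8, and the use of an extra input channel as the delimiter are illustrative assumptions in the spirit of the task described in (Graves et al. 2014).

```python
import random


def make_copy_example(seq_len, width=8, seed=None):
    """Build one (inputs, targets) pair for a copy task.

    Hypothetical helper for illustration only; the layout (one extra input
    channel used as a delimiter) is an assumption, not the paper's code.
    """
    rng = random.Random(seed)

    # Random binary vectors that the model must remember.
    seq = [[float(rng.randint(0, 1)) for _ in range(width)]
           for _ in range(seq_len)]

    # Presentation phase + one delimiter step + recall phase.
    total_steps = 2 * seq_len + 1
    inputs = [[0.0] * (width + 1) for _ in range(total_steps)]
    targets = [[0.0] * width for _ in range(total_steps)]

    for t, vec in enumerate(seq):
        inputs[t][:width] = vec                  # show the sequence
        targets[seq_len + 1 + t] = list(vec)     # expect it back afterwards

    inputs[seq_len][width] = 1.0                 # delimiter on the extra channel
    return inputs, targets
```

During the recall phase the inputs are all zeros, so the model can only reproduce the sequence from what it has written to memory, which is why the task is sensitive to how the memory is organized.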
