Learning Numeracy: Binary Arithmetic with Neural Turing Machines

Jacopo Castellini

TL;DR

This work evaluates Neural Turing Machines (NTMs) on two fundamental algorithmic tasks—binary addition and multiplication—to assess their ability to learn and generalize sequence-based computations using differentiable external memory. By comparing feedforward and LSTM controllers against a strong LSTM baseline, the study shows that certain NTM variants, particularly FF-NTM, can generalize addition better than the baseline, while multiplication remains difficult for all configurations. The results highlight both the potential and limits of NTMs for learning algorithms, and suggest future work to analyze learned strategies and enhance memory addressing mechanisms. Overall, the paper demonstrates that differentiable memory can support algorithmic reasoning in neural models, with implications for scalable, generalizable sequence processing.

Abstract

One of the main problems encountered so far with recurrent neural networks is that they struggle to retain long-term dependencies in their recurrent connections. Neural Turing Machines (NTMs) attempt to mitigate this issue by providing the neural network with an external memory, in which information can be stored and manipulated later on. The whole mechanism is differentiable end-to-end, allowing the network to learn how to utilise this long-term memory via stochastic gradient descent. This allows NTMs to infer simple algorithms directly from data sequences. Nonetheless, the model can be hard to train because of its large number of parameters and interacting components, and little related work exists. In this work we use NTMs to learn and generalise two arithmetical tasks: binary addition and multiplication. These tasks are two fundamental algorithmic examples in computer science and are considerably more challenging than those previously explored; with them we aim to shed some light on the real capabilities of this neural model.
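To make the idea of a differentiable external memory concrete, the following is a minimal sketch of a content-based soft read, in the spirit of the original NTM formulation: a key is compared with every memory row by cosine similarity, the similarities are sharpened and normalised with a softmax, and the read vector is the resulting weighted sum of rows. The function name `content_read`, the toy memory and the value of `beta` are illustrative assumptions, not taken from the paper, and the full NTM addressing scheme includes further steps that are omitted here.

```python
import math

def content_read(memory, key, beta):
    """Soft content-based read from a memory given as a list of rows.

    Hypothetical helper for illustration only; not the paper's code.
    """
    eps = 1e-8  # avoids division by zero for all-zero rows or keys
    key_norm = math.sqrt(sum(k * k for k in key))

    # Cosine similarity between the key and each row, scaled by beta.
    scores = []
    for row in memory:
        dot = sum(m * k for m, k in zip(row, key))
        row_norm = math.sqrt(sum(m * m for m in row))
        scores.append(beta * dot / (row_norm * key_norm + eps))

    # Numerically stable softmax: weights are positive and sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The read vector is a weighted sum of all rows, so every step
    # above is smooth with respect to the memory, the key and beta.
    width = len(memory[0])
    read_vec = [sum(w * row[j] for w, row in zip(weights, memory))
                for j in range(width)]
    return read_vec, weights

# Toy memory with three slots; the key matches the second slot.
memory = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
read_vec, weights = content_read(memory, key=[0.0, 1.0, 0.0], beta=10.0)
```

Because the read blends all slots rather than selecting one by a hard index, gradients can flow to every memory location, which is what lets the addressing be trained with stochastic gradient descent. A larger `beta` concentrates the weights on the best-matching slot.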

Paper Structure

This paper contains 10 sections, 10 equations, 11 figures, 2 tables, 2 algorithms.

Figures (11)

  • Figure 1: The schematic structure of an NTM. The controller receives both an external input and some data read from the memory, processes them and produces an output sequence, possibly also storing some data in the memory.
  • Figure 2: Learning curves for the addition task.
  • Figure 3: Generalization error of the trained models.
  • Figure 5: Learning curves for the multiplication task.
  • Figure 6: Generalization error of the trained models.
  • ...and 6 more figures