Arithmetic with Language Models: from Memorization to Computation
Davide Maltoni, Matteo Ferrara
TL;DR
This work investigates how a small, non-pretrained Transformer LM can perform binary arithmetic beyond its training data, framing the computation as an Encoding-Regression-Decoding (ERD) process. Through controlled experiments on binary addition and multiplication, the authors show that the model learns these tasks with strong generalization. They also provide evidence that the computation occurs as regression in a learned value space, after an initial encoding step and before a final decoding step. The study uses interpolation/extrapolation analyses, internal representation correlations, and amnesic probing to support the ERD account, and demonstrates the feasibility of numeric regression as a core mechanism in language models. The findings shed light on how LMs process numbers and may inform how arithmetic and reasoning capabilities are integrated into practical NLP systems.
Abstract
A better understanding of the emergent computation and problem-solving capabilities of recent large language models is of paramount importance to further improve them and broaden their applicability. This work investigates how a language model, trained to predict the next token, can perform arithmetic computations generalizing beyond training data. Binary addition and multiplication constitute a good testbed for this purpose, since they require a very small vocabulary and exhibit relevant input/output discontinuities making smooth input interpolation ineffective for novel data. We successfully trained a light language model to learn these tasks and ran a number of experiments to investigate the extrapolation capabilities and internal information processing. Our findings support the hypothesis that the language model works as an Encoding-Regression-Decoding machine where the computation takes place in the value space once the input token representation is mapped to an appropriate internal representation.
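As a rough illustration of the two ideas in the abstract, the sketch below hand-codes an Encoding-Regression-Decoding pipeline for binary addition and shows the input/output discontinuity that defeats smooth interpolation over tokens. It is only an analogy under stated assumptions: the function names (`encode`, `decode`, `erd_add`, `hamming`) are hypothetical, and the trained model would have to learn its internal value representation rather than use Python integers.

```python
# Hand-written analogy of the ERD hypothesis; not the authors' model.
# All names here are hypothetical and chosen for illustration only.

def encode(bits: str) -> int:
    """Encoding: map a binary token string to a point in value space."""
    return int(bits, 2)

def decode(value: int, width: int) -> str:
    """Decoding: map a value back to a fixed-width binary token string."""
    return format(value, f"0{width}b")

def erd_add(a_bits: str, b_bits: str) -> str:
    """Encode both operands, add in value space, decode the result."""
    width = max(len(a_bits), len(b_bits)) + 1  # one extra bit for the carry
    return decode(encode(a_bits) + encode(b_bits), width)

def hamming(x: str, y: str) -> int:
    """Number of token positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(x, y))

# Two inputs that differ in a single token of the second operand.
out_a = erd_add("0111", "0001")  # 7 + 1
out_b = erd_add("0111", "0000")  # 7 + 0

print(out_a, out_b)                # 01000 00111
print(hamming("0001", "0000"))     # 1 input token changed
print(hamming(out_a, out_b))       # 4 output tokens changed
```

A one-token change in the input flips four of the five output tokens because of carry propagation, whereas in value space the two results differ by exactly 1. This is the sense in which operating on values, rather than interpolating over token patterns, makes the task tractable.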
