Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers
Marek Kadlčík, Michal Štefánik, Timothee Mickus, Michal Spiegel, Josef Kuchař
TL;DR
Pretrained language models encode numeric information in a highly structured, often sinusoidal, embedding pattern. The authors introduce sine-based and Fourier-inspired probes that decode integers from number embeddings with near-perfect accuracy across multiple open-source LMs, revealing that precise numeric representations emerge from pretraining alone. They show that the precision of these numeric embeddings explains a substantial portion of elementary arithmetic errors and that aligning embeddings to the discovered pattern can reduce such errors. This work challenges the notion that LMs cannot reliably represent exact quantities and demonstrates that targeted embedding interventions can improve numeracy without external calculation tools.
Abstract
Pretrained language models (LMs) are prone to arithmetic errors. Existing work showed limited success in probing numeric values from models' representations, indicating that these errors can be attributed to the inherent unreliability of distributionally learned embeddings in representing exact quantities. However, we observe that previous probing methods are inadequate for the emergent structure of learned number embeddings, which follow sinusoidal patterns. In response, we propose a novel probing technique that decodes numeric values from input embeddings with near-perfect accuracy across a range of open-source LMs. This proves that, after pre-training alone, LMs represent numbers with remarkable precision. Finally, we find that the embeddings' precision, judged by our probe's accuracy, explains a large portion of LMs' errors in elementary arithmetic, and we show that aligning the embeddings with the pattern our probes discover can mitigate these errors.
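
To make the idea of a sinusoidal probe concrete, the following is a minimal sketch on synthetic data, not the authors' implementation. It assumes toy "embeddings" built as a random linear mix of sine and cosine features of each integer plus a little noise; the periods, dimensions, noise level, and all function names are hypothetical choices made for illustration. The probe learns a linear map from embeddings to sinusoidal features and decodes a held-out number by picking the integer whose features match best. An ordinary linear regression probe is included only as a stand-in baseline.

```python
import numpy as np


def fourier_features(numbers, periods):
    """Return cos and sin of 2*pi*n/T for every number n and period T."""
    angles = 2.0 * np.pi * numbers[:, None] / periods[None, :]
    return np.concatenate([np.cos(angles), np.sin(angles)], axis=1)


def run_probe_demo(seed=0, n_numbers=1000, dim=64, noise=0.01):
    """Compare a sinusoidal probe with a linear probe on synthetic embeddings."""
    rng = np.random.default_rng(seed)
    numbers = np.arange(n_numbers)
    # Assumed periods, chosen for illustration only (not taken from the paper).
    periods = np.array([2.0, 5.0, 10.0, 100.0, 1000.0])

    # Synthetic "embeddings": random linear mix of sinusoidal features plus noise.
    basis = fourier_features(numbers, periods)
    mixing = rng.normal(size=(basis.shape[1], dim))
    embeddings = basis @ mixing + noise * rng.normal(size=(n_numbers, dim))

    # Hold out 20% of the numbers so that decoding is tested on unseen values.
    order = rng.permutation(n_numbers)
    test_idx, train_idx = order[: n_numbers // 5], order[n_numbers // 5:]

    # Sinusoidal probe: least-squares map from embeddings to Fourier features,
    # then decode by the best-matching feature vector among all candidates.
    weights, *_ = np.linalg.lstsq(
        embeddings[train_idx], basis[train_idx], rcond=None
    )
    predicted_features = embeddings[test_idx] @ weights
    sin_pred = np.argmax(predicted_features @ basis.T, axis=1)
    sin_acc = float(np.mean(sin_pred == numbers[test_idx]))

    # Baseline: linear regression (with intercept) straight to the scalar value,
    # rounded to the nearest integer.
    design = np.hstack([embeddings, np.ones((n_numbers, 1))])
    lin_w, *_ = np.linalg.lstsq(
        design[train_idx], numbers[train_idx].astype(float), rcond=None
    )
    lin_pred = np.rint(design[test_idx] @ lin_w).astype(int)
    lin_acc = float(np.mean(lin_pred == numbers[test_idx]))

    return sin_acc, lin_acc
```

Calling `run_probe_demo()` returns the exact-match accuracy of both probes on the held-out numbers. In this toy setting the embeddings are sinusoidal by construction, so the comparison only illustrates why a probe that matches the structure of the embeddings can decode values that a purely linear readout misses; it says nothing about any real model.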
