Pre-trained Large Language Models Use Fourier Features to Compute Addition
Tianyi Zhou, Deqing Fu, Vatsal Sharan, Robin Jia
TL;DR
The paper shows that pre-trained large language models compute addition not by memorization but through Fourier features in their hidden representations. The components divide the work: MLP layers primarily approximate the magnitude of the answer using low-frequency features, while attention layers primarily perform modular addition using high-frequency features, and the two contributions combine to yield the correct result. Pre-training is crucial: models trained from scratch exploit only low-frequency features and reach lower accuracy unless they are given pre-trained token embeddings. This inductive bias also transfers to in-context learning. The findings offer a mechanistic, frequency-domain view of arithmetic in Transformer models and suggest how pre-training shapes capabilities for algorithmic tasks, with implications for prompting and model design.
Abstract
Pre-trained large language models (LLMs) exhibit impressive mathematical reasoning capabilities, yet how they compute basic arithmetic, such as addition, remains unclear. This paper shows that pre-trained LLMs add numbers using Fourier features -- dimensions in the hidden state that represent numbers via a set of features sparse in the frequency domain. Within the model, MLP and attention layers use Fourier features in complementary ways: MLP layers primarily approximate the magnitude of the answer using low-frequency features, while attention layers primarily perform modular addition (e.g., computing whether the answer is even or odd) using high-frequency features. Pre-training is crucial for this mechanism: models trained from scratch to add numbers only exploit low-frequency features, leading to lower accuracy. Introducing pre-trained token embeddings to a randomly initialized model rescues its performance. Overall, our analysis demonstrates that appropriate pre-trained representations (e.g., Fourier features) can unlock the ability of Transformers to learn precise mechanisms for algorithmic tasks.
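To make the division of labor concrete, the following is a minimal sketch of how a coarse magnitude estimate and a modular residue can jointly pin down an exact sum. It is an illustration only, not the model's actual circuit: the periods in `PERIODS`, every function name, and the `coarse_error` parameter are assumptions introduced here. In particular, the low-frequency magnitude estimate is simulated by adding a bounded error to the true sum rather than being computed from features.

```python
import numpy as np

# Hypothetical periods chosen for illustration; period 2 encodes parity,
# the example the abstract gives for modular addition.
PERIODS = (2, 5, 10)


def fourier_features(n, periods=PERIODS):
    """Encode integer n as one (cos, sin) pair per period."""
    angles = 2 * np.pi * n / np.asarray(periods, dtype=float)
    return np.concatenate([np.cos(angles), np.sin(angles)])


def add_phases(fa, fb):
    """Features of a + b from features of a and b (angle-addition identities)."""
    k = len(fa) // 2
    ca, sa, cb, sb = fa[:k], fa[k:], fb[:k], fb[k:]
    return np.concatenate([ca * cb - sa * sb, sa * cb + ca * sb])


def residues(feat, periods=PERIODS):
    """Decode the residue modulo each period from the (cos, sin) pairs."""
    k = len(periods)
    angles = np.arctan2(feat[k:], feat[:k])
    return [int(round(a * T / (2 * np.pi))) % T for a, T in zip(angles, periods)]


def snap_to_residue(coarse, last_digit):
    """Return the integer nearest to `coarse` whose residue mod 10 matches."""
    base = round(coarse)
    candidates = [c for c in range(base - 5, base + 6) if c % 10 == last_digit]
    return min(candidates, key=lambda c: abs(c - coarse))


def add_via_fourier(a, b, coarse_error=0.0):
    """Combine a simulated coarse estimate with an exact mod-10 residue."""
    feat = add_phases(fourier_features(a), fourier_features(b))
    last_digit = residues(feat)[PERIODS.index(10)]
    # Assumption: stand-in for a low-frequency magnitude estimate, |error| < 5.
    coarse = a + b + coarse_error
    return snap_to_residue(coarse, last_digit)


if __name__ == "__main__":
    print(add_via_fourier(357, 486, coarse_error=3.2))
```

In this toy setting, the coarse estimate alone is off by several units and the residue alone cannot distinguish answers that differ by a multiple of 10, but together they identify the sum as long as the coarse error stays below 5. That mirrors, in a simplified form, why relying only on low-frequency features would be expected to cost accuracy.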
