The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs
Tanja Baeumel, Josef van Genabith, Simon Ostermann
TL;DR
This paper investigates why autoregressive LLMs struggle with multi-operand addition, attributing the difficulty to a shallow one-digit lookahead that fails to anticipate cascading carries. Through probing experiments, a formalization of left-to-right addition, and controlled multi-operand datasets, the authors show that the carry state is not reliably represented when only a single-digit lookahead is available, and that this limitation persists across tokenization schemes. They introduce a formal carry heuristic, H1, demonstrate its predictive power for both two-operand and multi-operand addition, and provide empirical evidence that model accuracy deteriorates in proportion to the number of operands. The findings reveal a fundamental limitation of current LLMs in complex numerical reasoning and point to deeper lookahead as a promising direction for improving arithmetic capabilities, with practical implications for numerical tasks and algorithmic reasoning.
Abstract
Autoregressive large language models (LLMs) exhibit impressive performance across various tasks but struggle with simple arithmetic, such as the addition of two or more operands. We show that this struggle arises from LLMs' use of a simple one-digit lookahead heuristic, which works fairly well (but not perfectly) for two-operand addition but fails in multi-operand cases, where the carry-over logic is more complex. Our probing experiments and digit-wise accuracy evaluation show that LLMs fail precisely where a one-digit lookahead is insufficient to account for cascading carries. We analyze the impact of tokenization strategies on arithmetic performance and show that all investigated models, regardless of tokenization, are inherently limited in the addition of multiple operands due to their reliance on a one-digit lookahead heuristic. Our findings reveal fundamental limitations that prevent LLMs from generalizing to more complex numerical reasoning.
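
To make the idea of a one-digit lookahead concrete, the following is a minimal illustrative sketch, not the paper's formal definition of H1. It assumes that each result digit is generated left to right and that the carry into a column is estimated from the sum of the next, less significant column alone. The function name `one_digit_lookahead_add` and the floor-division carry estimate are hypothetical choices made for illustration.

```python
def one_digit_lookahead_add(operands):
    """Add non-negative integers left to right, estimating each carry
    from the next (less significant) column only.

    Illustrative assumption: the carry into column i is taken to be
    floor(column_sum[i + 1] / 10), ignoring any carry that column i + 1
    itself receives from further right.
    """
    # Pad so there is room for leading carry digits.
    width = max(len(str(n)) for n in operands) + len(str(len(operands)))
    padded = [str(n).zfill(width) for n in operands]

    # Column sums, most significant column first.
    column_sums = [sum(int(p[i]) for p in padded) for i in range(width)]

    digits = []
    for i, s in enumerate(column_sums):
        # Look exactly one column ahead to guess the incoming carry.
        lookahead = column_sums[i + 1] if i + 1 < width else 0
        estimated_carry = lookahead // 10
        digits.append((s + estimated_carry) % 10)

    return int("".join(str(d) for d in digits))


# No cascading carry: the sketch agrees with exact addition.
print(one_digit_lookahead_add([123, 456]), 123 + 456)
print(one_digit_lookahead_add([27, 48]), 27 + 48)

# Cascading carry: the tens column sums to 9 and only overflows once the
# carry from the ones column arrives, which one column of lookahead misses.
print(one_digit_lookahead_add([145, 155]), 145 + 155)

# Four operands: the tens column sums to 19 and receives a carry of 2,
# so the true carry into the hundreds is 2 while the estimate is 1.
print(one_digit_lookahead_add([155, 155, 155, 145]), 155 + 155 + 155 + 145)
```

In this sketch the first two calls match exact addition, while the last two return 200 instead of 300 and 510 instead of 610. The errors occur exactly at the digit whose carry depends on a column more than one position to the right, which is the kind of failure the abstract attributes to cascading carries.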
