Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines

Junyu Lai, Jiahe Xu, Yao Yang, Yunpeng Huang, Chun Cao, Jingwei Xu

TL;DR

A Composable Arithmetic Execution Framework (CAEF) is proposed that enables LLMs to learn to execute step-by-step computations by emulating Turing Machines, thereby gaining a genuine understanding of computational logic.

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of natural language processing and reasoning tasks. However, their performance in the foundational domain of arithmetic remains unsatisfactory. When dealing with arithmetic tasks, LLMs often memorize specific examples rather than learning the underlying computational logic, which limits their ability to generalize to new problems. In this paper, we propose a Composable Arithmetic Execution Framework (CAEF) that enables LLMs to learn to execute step-by-step computations by emulating Turing Machines, thereby gaining a genuine understanding of computational logic. Moreover, the proposed framework is highly scalable: learned operators can be composed, which significantly reduces the difficulty of learning complex operators. In our evaluation, CAEF achieves nearly 100% accuracy across seven common mathematical operations on the LLaMA 3.1-8B model, effectively supporting computations involving operands with up to 100 digits, a scale at which GPT-4o falls noticeably short in some settings.

Paper Structure

This paper contains 31 sections, 1 equation, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: An illustrative CAEF flowchart demonstrates the execution of the Multiplication operation for $89 \times 2$. The aligner converts the original arithmetic expression into a Turing Machine-like representation that the Multiplication executor can process. Acting as an executor composer, the Multiplication executor calls upon two basic executors, i.e., Less_than and Addition, to perform the actual computation. All the executors and the aligner are executed by the LLM.
  • Figure 2: Diagram of the CAEF framework. The CAEF representation includes two required components, state and command, each corresponding to a marked area in the figure. The state records the current status, the operands, and the registers that store intermediate variables and results. The command consists of a set of actions, such as write operations and call operations. Upon receiving the state and command, the LLM generates the next state and the corresponding command, with each step corresponding to a transition in the state diagram on the left.
  • Figure 3: Execution process of $45 + 67$. The state diagram on the left abstracts the addition process. In step ②, a one-digit addition is performed, followed by updating the carry and output. The right side shows the actual sequence of state and command execution in the CAEF framework.
  • Figure 4: Execution process of $89 \times 2$. The state diagram on the left abstracts the multiplication process. In state $q_1$, the less-than executor is invoked: if it returns true, execution moves to state $q_2$; otherwise, it transitions to state $q_5$ and halts. Steps ③ and ④ accumulate the counter and the output, respectively. The right side shows the actual execution in the CAEF framework.
  • ...and 2 more figures
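
The execution flows described in the captions of Figures 3 and 4 can be illustrated with a minimal Python sketch. This is not the CAEF representation itself, and no LLM is involved: the function names (`add_steps`, `add`, `less_than`, `multiply`) and the dictionary used for the state are hypothetical, chosen only to show how a digit-by-digit addition executor updates a carry and an output, and how a multiplication executor might be composed from a less-than check and repeated addition.

```python
from typing import Dict, Iterator


def add_steps(a: str, b: str) -> Iterator[Dict[str, object]]:
    """Yield one state per step of right-to-left, digit-wise addition.

    The state layout (pos, carry, output) is an illustrative assumption,
    not the state format used in the paper.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, output = 0, ""
    for i in range(width - 1, -1, -1):
        # One-digit addition, followed by updating the carry and output.
        total = int(a[i]) + int(b[i]) + carry
        carry, digit = divmod(total, 10)
        output = str(digit) + output
        yield {"pos": i, "carry": carry, "output": output}
    if carry:
        # Final step: write the remaining carry in front of the output.
        output = str(carry) + output
        yield {"pos": -1, "carry": 0, "output": output}


def add(a: str, b: str) -> str:
    """Run the addition steps to completion and return the final output."""
    state: Dict[str, object] = {"output": "0"}
    for state in add_steps(a, b):
        pass
    return str(state["output"])


def less_than(a: str, b: str) -> bool:
    """Compare two non-negative decimal strings without converting to int."""
    a, b = a.lstrip("0") or "0", b.lstrip("0") or "0"
    if len(a) != len(b):
        return len(a) < len(b)
    return a < b


def multiply(a: str, b: str) -> str:
    """Multiply by composing less_than and add, as a loop over a counter."""
    counter, output = "0", "0"
    while less_than(counter, b):      # check: counter < b, else halt
        counter = add(counter, "1")   # accumulate the counter
        output = add(output, a)       # accumulate the output
    return output


if __name__ == "__main__":
    for step in add_steps("45", "67"):
        print(step)
    print(multiply("89", "2"))
```

Running the script prints the three intermediate states of $45 + 67$ (ending with output `112`) and then `178` for $89 \times 2$. Note that the loop in `multiply` takes a number of steps proportional to the value of the second operand; the sketch is meant to show the composition pattern, not an efficient algorithm.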