MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning

Nianqi Li, Zujie Liang, Siyu Yuan, Jiaqing Liang, Feng Wei, Yanghua Xiao

TL;DR

MultiLingPoT introduces a scalable, supervised fine-tuning approach to multilingual program-of-thought, enabling LLMs to solve math problems using PoTs in Python, C++, Java, and Matlab. By constructing a large multilingual PoT dataset and applying prior and posterior hybrid strategies, the method enables adaptive language selection per problem, achieving about a 2% improvement per language on simple tasks and up to a 6% gain over single-language PoT with data augmentation. The posterior mixing strategies, particularly Voting with Scorer, yield the strongest improvements on complex problems, and the approach generalizes across different base models with code capabilities. Overall, MultiLingPoT demonstrates that leveraging multiple languages and language-aware mixing substantially enhances mathematical reasoning in LLMs with a manageable increase in computation when using posterior strategies.

Abstract

Program-of-Thought (PoT), which aims to use a programming language instead of natural language as an intermediate step in reasoning, is an important way for LLMs to solve mathematical problems. Since different programming languages excel in different areas, it is natural to use the most suitable language for solving specific problems. However, current PoT research focuses only on single-language PoT, ignoring the differences between programming languages. Therefore, this paper proposes a multilingual program reasoning method, MultiLingPoT. This method allows the model to answer questions using multiple programming languages by fine-tuning on multilingual data. Additionally, prior and posterior hybrid methods are used to help the model select the most suitable language for each problem. Our experimental results show that the training of MultiLingPoT improves each program's mathematical reasoning by about 2.5%. Moreover, with proper mixing, the performance of MultiLingPoT can be further improved, achieving a 6% increase compared to the single-language PoT with data augmentation. Resources of this paper can be found at https://github.com/Nianqi-Li/MultiLingPoT.
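
The summary above names posterior mixing and "Voting with Scorer" but does not spell out how the per-language results are combined. As a minimal sketch, assuming each language's program yields one answer and a hypothetical scorer assigns each program a confidence, a score-weighted vote might look like the following. The function name, the input format, and the weighting rule are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict


def vote_with_scores(answers, scores):
    """Pick one final answer from per-language PoT outputs.

    answers: dict mapping language name -> answer string,
             or None when that language's program failed to run.
    scores:  dict mapping language name -> confidence from a
             hypothetical scorer (assumed, not from the paper).
    """
    totals = defaultdict(float)
    for lang, answer in answers.items():
        if answer is None:
            continue  # failed programs cast no vote
        # Each language adds its score to the answer it produced.
        totals[answer] += scores.get(lang, 0.0)
    if not totals:
        return None  # every program failed
    # The answer with the highest accumulated score wins.
    return max(totals, key=totals.get)


if __name__ == "__main__":
    answers = {"python": "12", "cpp": "12", "java": "15", "matlab": None}
    scores = {"python": 0.4, "cpp": 0.3, "java": 0.6, "matlab": 0.9}
    print(vote_with_scores(answers, scores))
```

In this example two lower-scored languages that agree outweigh one higher-scored language that disagrees, which is the intuition behind combining voting with a scorer rather than trusting the single highest score.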

Paper Structure

This paper contains 31 sections, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Examples of different programming languages having different advantages. For a given question, the suitable language makes the problem easy to answer, while other languages make it more difficult.
  • Figure 2: Illustration of the MultiLingPoT methodology, including data construction, model training, and the hybrid strategies. Because the hybrid strategies are implemented in diverse ways, the "Think" part represents only the underlying logic of the hybrid strategy, not its specific implementation.