MindMerger: Efficient Boosting LLM Reasoning in non-English Languages

Zixian Huang, Wenhao Zhu, Gong Cheng, Lei Li, Fei Yuan

TL;DR

MindMerger addresses the gap in multilingual reasoning by preserving LLMs' built-in reasoning and language understanding and augmenting them with external multilingual model capabilities. It introduces a two-stage training scheme that first embeds external language understanding into the LLM representation and then enables collaborative use of external and built-in capabilities, without updating LLM parameters. Empirically, MindMerger-Soft achieves consistent improvements across MGSM, MSVAMP, X-CSQA, and XNLI, with average gains of $6.7\%$ overall and $8.0\%$ in low-resource languages on MGSM, and outperforms replacement-based baselines by significant margins. The work highlights that encoder-based multilingual models and alignment-based representation merging are effective for cross-language reasoning, and demonstrates robustness across multiple LLMs, suggesting broad applicability to multilingual AI tasks.

Abstract

Reasoning capabilities are crucial for Large Language Models (LLMs), yet a notable gap exists between English and non-English languages. To bridge this disparity, some works fine-tune LLMs to relearn reasoning capabilities in non-English languages, while others replace non-English inputs with an external model's outputs, such as English translations, to circumvent the difficulty LLMs have in understanding non-English text. Unfortunately, these methods often underutilize the built-in reasoning and language understanding capabilities of LLMs. To better utilize the minds of reasoning and language understanding in LLMs, we propose a new method, MindMerger, which merges LLMs with the external language understanding capabilities of multilingual models to boost multilingual reasoning performance. Furthermore, a two-step training scheme is introduced that first trains the model to embed the external capabilities into LLMs and then trains the collaborative utilization of the external capabilities and the built-in capabilities of LLMs. Experiments on three multilingual reasoning datasets and a language understanding dataset demonstrate that MindMerger consistently outperforms all baselines, especially in low-resource languages. Without updating the parameters of LLMs, the average accuracy on the MGSM dataset improved by 6.7% across all languages and by 8.0% across low-resource languages.

Paper Structure

This paper contains 28 sections, 7 equations, 4 figures, and 16 tables.

Figures (4)

  • Figure 1: Examples of multilingual mathematical reasoning from the MGSM dataset. An LLM may generate correct or incorrect answers when the same question is asked in different languages.
  • Figure 2: Overview of the model structure and training scheme of MindMerger, which consists of an LLM (blue) and an external model (yellow) and is trained with a two-stage scheme.
  • Figure 3: Ablation experiments of MindMerger-Soft on the MGSM dataset. Lrl., Hrl., and Avg. represent the average accuracy across low-resource languages, high-resource languages, and all languages, respectively. Following MGSM, we regard Bn, Th, and Sw as low-resource languages and the remaining languages as high-resource languages.
  • Figure 4: t-SNE visualization in the spaces of the LLM embeddings and the mapping layer outputs.
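
The Lrl., Hrl., and Avg. columns described in the Figure 3 caption are plain per-group means of per-language accuracy. The following is a minimal sketch of that arithmetic, not the authors' evaluation code: the function name `group_averages` and the example accuracies are hypothetical placeholders, and only the low-resource set (Bn, Th, Sw) comes from the caption.

```python
from statistics import mean

# Low-resource languages as stated in the Figure 3 caption.
LOW_RESOURCE = frozenset({"Bn", "Th", "Sw"})


def group_averages(accuracy_by_lang, low_resource=LOW_RESOURCE):
    """Return (Lrl., Hrl., Avg.) for a mapping of language code -> accuracy.

    Hypothetical helper for illustration: every language not listed in
    `low_resource` is treated as high-resource.
    """
    lrl = [acc for lang, acc in accuracy_by_lang.items() if lang in low_resource]
    hrl = [acc for lang, acc in accuracy_by_lang.items() if lang not in low_resource]
    if not lrl or not hrl:
        raise ValueError("need at least one language in each group")
    # Avg. is taken over all languages, not over the two group means.
    return mean(lrl), mean(hrl), mean(list(accuracy_by_lang.values()))


# Hypothetical placeholder accuracies (NOT results from the paper).
example = {"Bn": 40.0, "Th": 50.0, "Sw": 60.0, "En": 70.0, "De": 90.0}
lrl, hrl, avg = group_averages(example)
print(f"Lrl.={lrl:.1f}  Hrl.={hrl:.1f}  Avg.={avg:.1f}")
```

Because Avg. is computed over all languages, it generally differs from the midpoint of Lrl. and Hrl. whenever the two groups contain different numbers of languages.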