A Survey of Multilingual Reasoning in Language Models

Akash Ghosh, Debayan Datta, Sriparna Saha, Chirag Agarwal

TL;DR

This survey investigates how language models can perform logical reasoning across multiple languages, highlighting cross-lingual misalignment, data scarcity, and cultural considerations as key obstacles. It organizes the field through a taxonomy of methods (representation alignment, finetuning, prompting, and model editing), catalogues multilingual datasets and benchmarks, and analyzes evaluation metrics and benchmark performance. The authors identify gaps in domain coverage, language diversity, and efficient, scalable reasoning, and propose concrete directions including cross-lingual transfer, explainable reasoning, unified metrics, and multimodal multilingual tasks. Overall, the work provides a structured roadmap for advancing multilingual reasoning in LLMs with practical implications for inclusive, culturally aware AI systems.

Abstract

While reasoning and multilingual capabilities in language models (LMs) have achieved remarkable progress in recent years, their integration into a unified paradigm, multilingual reasoning, is at a nascent stage. Multilingual reasoning requires language models to handle logical reasoning across languages while addressing misalignment, biases, and challenges in low-resource settings. This survey provides the first in-depth review of multilingual reasoning in LMs. We give a systematic overview of existing methods that leverage LMs for multilingual reasoning, outlining the challenges, motivations, and foundational aspects of applying language models to reason across diverse languages. We then review the standard data resources used to train LMs for multilingual reasoning and the evaluation benchmarks used to assess their multilingual capabilities. Next, we analyze various state-of-the-art methods and their performance on these benchmarks. Finally, we explore future research opportunities to improve multilingual reasoning in LMs, focusing on enhancing their ability to handle diverse languages and complex reasoning tasks. Ongoing developments in this rapidly growing field can be tracked on our project page: [https://github.com/AkashGhosh/Survey-of-Multilingual-Reasoning-in-Language-Models](https://github.com/AkashGhosh/Survey-of-Multilingual-Reasoning-in-Language-Models)

Paper Structure

This paper contains 18 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Taxonomy tree of current multilingual reasoning research. The main thrusts for improving multilingual reasoning are representation learning, fine-tuning, prompting, and model editing. While initial research following the emergence of multilingual LLMs focused on naive prompting, recent works propose several alignment, editing, and fine-tuning strategies to improve reasoning in multilingual LLMs.
  • Figure 2: Language distribution across training corpora and benchmarks for multilingual reasoning. The y-axis denotes the number of training corpora/benchmark datasets that include a given language (x-axis). We observe a long-tail distribution, indicating that current datasets predominantly cover languages like Chinese, English, French, and German, and highlighting the need for benchmarks that represent long-tail languages.
  • Figure 3: Distribution of multilingual reasoning datasets. We find that datasets predominantly comprise logical, commonsense, and math reasoning, and the community needs benchmarks that include compositional and tabular reasoning.
  • Figure 4: Distribution of domains in multilingual reasoning datasets. While datasets in the legal, commonsense, and math domains cover up to 54% of current multilingual reasoning research, under-explored domains include ethics, science, visual, and compositional.
  • Figure 5: Taxonomy of Multilingual Reasoning Methods. A taxonomy of approaches for enhancing multilingual reasoning in models, covering (A) Representation Alignment, (B) Finetuning, (C) Prompting, and (D) Model Editing.
  • ...and 1 more figure