Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective

Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Ziyi Yang, Mahmoud Khademi, Hany Awadalla, Junjie Wang, Yujiu Yang, Furu Wei

TL;DR

The paper addresses the limits of single-paradigm reasoning in large language models on mathematical tasks by introducing Chain-of-Reasoning (CoR), a framework that unifies Natural Language Reasoning (NLR), Algorithmic Reasoning (AR), and Symbolic Reasoning (SR). It introduces the Multi-Paradigm Math (MPM) dataset and the Progressive Paradigm Training (PPT) curriculum, which together train a model (CoR-Math-7B) to use all three paradigms and synthesize their outputs into a final solution. Across five challenging benchmarks spanning arithmetic and theorem proving, CoR-Math-7B achieves state-of-the-art zero-shot performance, including up to a 41.0% absolute improvement over GPT-4o on miniF2F and a 15.0% improvement over RL-based methods on MATH, while maintaining efficiency through multi-paradigm test-time inference (SMPS). The work argues that cross-paradigm collaboration improves both generalization and efficiency, and proposes it as a direction for scalable, unified mathematical reasoning in LLMs.

Abstract

Large Language Models (LLMs) have made notable progress in mathematical reasoning, yet often rely on single-paradigm reasoning, limiting their effectiveness across diverse tasks. We introduce Chain-of-Reasoning (CoR), a novel unified framework integrating multiple reasoning paradigms--Natural Language Reasoning (NLR), Algorithmic Reasoning (AR), and Symbolic Reasoning (SR)--to enable synergistic collaboration. CoR generates multiple potential answers via different reasoning paradigms and synthesizes them into a coherent final solution. We propose a Progressive Paradigm Training (PPT) strategy for models to progressively master these paradigms, leading to CoR-Math-7B. Experimental results demonstrate that CoR-Math-7B significantly outperforms current SOTA models, achieving up to a 41.0% absolute improvement over GPT-4o in theorem proving and a 15.0% improvement over RL-based methods on the MATH benchmark in arithmetic tasks. These results show the enhanced mathematical comprehension ability of our model, enabling zero-shot generalization across tasks.
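
The generate-then-synthesize flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: `ParadigmStep`, `chain_of_reasoning`, the toy solvers, and in particular `majority_synthesize`, which uses a simple majority vote as a stand-in for synthesis. In CoR the model itself produces and combines the paradigm outputs, so this only shows the shape of the control flow.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ParadigmStep:
    paradigm: str  # "NLR", "AR", or "SR"
    content: str   # reasoning text, program, or formal statement
    answer: str    # candidate answer produced under this paradigm


# A solver sees the problem plus all earlier steps and returns a new step.
Solver = Callable[[str, List[ParadigmStep]], ParadigmStep]


def chain_of_reasoning(
    problem: str,
    solvers: List[Solver],
    synthesize: Callable[[List[ParadigmStep]], str],
) -> str:
    """Run each paradigm in order, then combine the candidate answers."""
    steps: List[ParadigmStep] = []
    for solver in solvers:
        steps.append(solver(problem, steps))
    return synthesize(steps)


def majority_synthesize(steps: List[ParadigmStep]) -> str:
    """Stand-in synthesis (an assumption): majority vote, ties go to the latest step."""
    if not steps:
        raise ValueError("no reasoning steps to synthesize")
    counts = Counter(step.answer for step in steps)
    best = max(counts.values())
    for step in reversed(steps):
        if counts[step.answer] == best:
            return step.answer
    raise AssertionError("unreachable")


# Toy solvers for problems of the form "a + b" (illustrative only).
def toy_nlr(problem: str, prior: List[ParadigmStep]) -> ParadigmStep:
    return ParadigmStep("NLR", "Adding the two numbers gives 5.", "5")


def toy_ar(problem: str, prior: List[ParadigmStep]) -> ParadigmStep:
    a, b = (int(token) for token in problem.split("+"))
    return ParadigmStep("AR", f"print({a} + {b})", str(a + b))


def toy_sr(problem: str, prior: List[ParadigmStep]) -> ParadigmStep:
    return ParadigmStep("SR", "example : 2 + 3 = 5 := rfl", "5")


final_answer = chain_of_reasoning(
    "2 + 3", [toy_nlr, toy_ar, toy_sr], majority_synthesize
)
```

Passing earlier steps to each solver is a design choice of this sketch: it lets a later paradigm check or refine what an earlier one produced, rather than running three independent attempts.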

Paper Structure

This paper contains 36 sections, 7 equations, 12 figures, 9 tables.

Figures (12)

  • Figure 1: A comprehensive comparative analysis of CoR-Math-7B and baseline models across mathematical tasks. (a) shows the effectiveness of CoR-Math-7B (zero-shot) in theorem proving tasks. (b) shows a resource-efficiency analysis for arithmetic computation tasks, where CoR-Math-7B achieves optimal resource efficiency and near-optimal zero-shot performance.
  • Figure 2: The reasoning process under different paradigms: (a) In single-paradigm reasoning, each reasoning step relies on the same knowledge medium, such as Natural Language (NL), algorithms, or symbols. (b) In tool-integrated single-paradigm reasoning, NL is used for reasoning, while code assists in solving specific sub-problems. After the execution results are obtained, reasoning continues in NL. (c) The proposed CoR reasoning framework, along with several examples, shows that multi-paradigm reasoning allows for varying reasoning depths to address different types of problems.
  • Figure 3: An overview of (a) the Multi-Paradigm Math (MPM) dataset construction process, involving reconstruction, extension, and theorem prover verification, and (b) the Progressive Paradigm Training (PPT) method, where the model is trained with increasing reasoning paradigms in stages.
  • Figure 4: An evaluation of the effectiveness of the PPT strategy. We present the zero-shot Pass@1 results on the MATH and GSM8K benchmarks across three cumulative stages of the PPT strategy. The results show that performance increases with each progressive stage.
  • Figure 5: An overview of the hierarchical structure of reasoning. (a) A single reasoning paradigm can contain multiple distinct reasoning paths. (b) An instance of single-paradigm reasoning follows one reasoning path, which consists of several reasoning steps. (c) Multi-paradigm reasoning combines several distinct reasoning paradigms.
  • ...and 7 more figures
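
The staged training described in the captions for Figures 3 and 4 (a model trained with an increasing number of reasoning paradigms over three cumulative stages) can be sketched as a simple curriculum loop. This is a sketch under stated assumptions, not the authors' implementation: the stage layout in `PPT_STAGES`, the `"paradigms"` field on each example, the subset-based data selection, and the `train_one_stage` callback are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed stage layout: each stage adds one paradigm to those already seen.
PPT_STAGES: List[Tuple[str, ...]] = [
    ("NLR",),
    ("NLR", "AR"),
    ("NLR", "AR", "SR"),
]

Example = Dict[str, List[str]]


def select_stage_data(dataset: Sequence[Example], allowed: Sequence[str]) -> List[Example]:
    """Keep examples whose paradigms are all available at this stage (assumption)."""
    allowed_set = set(allowed)
    return [ex for ex in dataset if set(ex["paradigms"]) <= allowed_set]


def progressive_paradigm_training(
    model,
    dataset: Sequence[Example],
    train_one_stage: Callable,
    stages: Sequence[Sequence[str]] = PPT_STAGES,
):
    """Train stage by stage, carrying the model from one stage into the next."""
    for allowed in stages:
        stage_data = select_stage_data(dataset, allowed)
        model = train_one_stage(model, stage_data)
    return model


# Toy usage: the "model" is just a log of how many examples each stage saw.
toy_data: List[Example] = [
    {"paradigms": ["NLR"]},
    {"paradigms": ["NLR", "AR"]},
    {"paradigms": ["NLR", "AR", "SR"]},
]


def record_stage(model: List[int], stage_data: List[Example]) -> List[int]:
    return model + [len(stage_data)]


stage_log = progressive_paradigm_training([], toy_data, record_stage)
```

In this toy run each stage sees every example from the earlier stages plus the newly unlocked ones; the actual data composition per stage in the paper may differ.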