A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang, Tian-Shuo Liu, Chenyang Wang, Yi-Di Wang, Shu Yan, Cheng-Xing Jia, Xu-Hui Liu, Xin-Wei Chen, Jia-Cheng Xu, Ziniu Li, Yang Yu
TL;DR
The paper surveys large language model–based mathematical reasoning through a two-phase lens of comprehension and answer generation, detailing how pretraining, supervised fine-tuning, and reinforcement learning shape reasoning abilities. It analyzes prompting strategies, diverse mathematical representations, and the pivotal role of Chain-of-Thought in producing structured, step-by-step solutions, while outlining methods to boost reasoning via data construction, RL, and inference-time search. The authors highlight practical challenges—data quality, reward signal design, exploration efficiency, and generalization to open domains—and discuss directions such as knowledge augmentation, external tools, and formal reasoning frameworks. Together, these insights aim to guide researchers and practitioners in advancing reasoning capabilities across domains and toward more robust, general-purpose AI systems.
Abstract
Mathematical reasoning has long been one of the most fundamental and challenging frontiers in artificial intelligence research. In recent years, large language models (LLMs) have achieved significant advances in this area. This survey examines the development of mathematical reasoning abilities in LLMs through two high-level cognitive phases: comprehension, where models gain mathematical understanding via diverse pretraining strategies, and answer generation, which has progressed from direct prediction to step-by-step Chain-of-Thought (CoT) reasoning. We review methods for enhancing mathematical reasoning, ranging from training-free prompting to fine-tuning approaches such as supervised fine-tuning and reinforcement learning, and discuss recent work on extended CoT and "test-time scaling". Despite notable progress, fundamental challenges remain in terms of capacity, efficiency, and generalization. To address these issues, we highlight promising research directions, including advanced pretraining and knowledge augmentation techniques, formal reasoning frameworks, and meta-generalization through principled learning paradigms. This survey aims to provide insights for researchers interested in enhancing the reasoning capabilities of LLMs and for those seeking to apply these techniques to other domains.
