A Survey of Scaling in Large Language Model Reasoning

Zihan Chen, Song Wang, Zhen Tan, Xingbo Fu, Zhenyu Lei, Peng Wang, Huan Liu, Cong Shen, Jundong Li

TL;DR

This survey analyzes how scaling strategies affect reasoning in large language models across four dimensions: input-size expansion, deeper reasoning steps, iterative reasoning rounds, and model-optimization techniques. It covers methods such as in-context learning (ICL), retrieval-augmented generation (RAG), memory augmentation, chain-of-thought and tree-based reasoning, multi-agent debate, human-in-the-loop interaction, and RL/latent-space approaches, highlighting their benefits, trade-offs, and stability concerns. Key contributions include a taxonomy of scaling methods, insights into diminishing returns and risk factors (efficiency, inverse scaling, security), and guidance for adaptive, robust deployment. The work informs future research and the practical design of scalable, trustworthy reasoning systems across AI research, production, and scientific domains.

Abstract

Rapid advances in large language models (LLMs) have significantly enhanced their reasoning capabilities, driven by various strategies such as multi-agent collaboration. However, unlike the well-established performance improvements achieved by scaling data and model size, scaling reasoning in LLMs is more complex and can even degrade reasoning performance, introducing new challenges in model alignment and robustness. In this survey, we provide a comprehensive examination of scaling in LLM reasoning, categorizing it into multiple dimensions and analyzing how, and to what extent, different scaling strategies improve reasoning capabilities. We begin by exploring scaling in input size, which enables LLMs to process and utilize more extensive context for improved reasoning. Next, we analyze scaling in reasoning steps, which improves multi-step inference and logical consistency. We then examine scaling in reasoning rounds, where iterative interactions refine reasoning outcomes. Furthermore, we discuss scaling in training-enabled reasoning, focusing on optimization through iterative model improvement. Finally, we review applications of scaling across domains and outline future directions for advancing LLM reasoning. By synthesizing these diverse perspectives, this survey aims to provide insights into how scaling strategies enhance the reasoning capabilities of LLMs and to guide the development of next-generation AI systems.

Paper Structure

This paper contains 21 sections and 1 figure.

Figures (1)

  • Figure 1: Taxonomy for Scaling in Large Language Model Reasoning.