A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang, Taolin Zhang, Richang Hong, Jun Huang
TL;DR
This survey reviews approximately 170 recent works on small reasoning models (SRMs), contrasting them with large reasoning models (LRMs) and focusing on training pipelines, inference techniques, and domain-specific applications. It groups SRMs into domain-specific backbones and general-purpose, distillation-based backbones, and details training approaches ranging from data annotation and supervised fine-tuning to reinforcement learning with process and outcome rewards. The paper also reviews inference-time strategies such as chain-of-thought (CoT) prompting, tree-of-thought (ToT) reasoning, and multi-agent architectures, along with their applicability to healthcare, science, and other domains. By outlining future directions (enhanced distillation, adaptive RL, low-resource learning, and efficient inference), the authors argue for the practical and sustainable deployment of SRMs in resource-constrained settings.
Abstract
Recently, the reasoning capabilities of large reasoning models (LRMs), such as DeepSeek-R1, have seen significant advancements through the slow thinking process. Despite these achievements, the substantial computational demands of LRMs present considerable challenges. In contrast, small reasoning models (SRMs), often distilled from larger ones, offer greater efficiency and can exhibit distinct capabilities and cognitive trajectories compared to LRMs. This work surveys around 170 recently published papers on SRMs for tackling various complex reasoning tasks. We review the current landscape of SRMs and analyze diverse training and inference techniques related to SRMs. Furthermore, we provide a comprehensive review of SRMs for domain-specific applications and discuss possible future research directions. This survey serves as an essential reference for researchers to leverage or develop SRMs for advanced reasoning functionalities with high efficiency.
