A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions

Chengyu Wang, Taolin Zhang, Richang Hong, Jun Huang

TL;DR

This survey covers approximately 170 recent works on small reasoning models (SRMs), contrasting them with large reasoning models (LRMs) and focusing on training pipelines, inference techniques, and domain-specific applications. It categorizes SRMs into typical domain-specific backbones and general-purpose, distillation-based backbones, and details training approaches ranging from data annotation and supervised fine-tuning to reinforcement learning with process and outcome rewards. The paper also reviews inference-time strategies such as chain-of-thought prompting, tree-based reasoning (ToT), and multi-agent architectures, along with their applicability to healthcare, science, and other domains. By outlining future directions (enhanced distillation, adaptive RL, low-resource learning, and efficient inference), the authors argue for the practical and sustainable deployment of SRMs in resource-constrained settings.

Abstract

Recently, the reasoning capabilities of large reasoning models (LRMs), such as DeepSeek-R1, have seen significant advancements through the slow thinking process. Despite these achievements, the substantial computational demands of LRMs present considerable challenges. In contrast, small reasoning models (SRMs), often distilled from larger ones, offer greater efficiency and can exhibit distinct capabilities and cognitive trajectories compared to LRMs. This work surveys around 170 recently published papers on SRMs for tackling various complex reasoning tasks. We review the current landscape of SRMs and analyze diverse training and inference techniques related to SRMs. Furthermore, we provide a comprehensive review of SRMs for domain-specific applications and discuss possible future research directions. This survey serves as an essential reference for researchers to leverage or develop SRMs for advanced reasoning functionalities with high efficiency.

Paper Structure

This paper contains 18 sections, 2 figures, and 2 tables.

Figures (2)

  • Figure 1: A simple comparison between representative LLMs and SRMs on various reasoning benchmarks.
  • Figure 2: The roadmap of this survey.