TRiMS: Real-Time Tracking of Minimal Sufficient Length for Efficient Reasoning via RL

Tingcheng Bian; Jinchang Luo; Mingquan Cheng; Jinyu Zhang; Xiaoling Xia; Ni Li; Yan Tao; Haiwei Wang

Abstract

Large language models achieve breakthroughs in complex reasoning via long chain-of-thought (CoT) sequences. However, this often leads to severe reasoning inflation, causing substantial computational redundancy. To maximize Intelligence per Token, we introduce a theoretical metric, Minimal Sufficient Length (MSL), which rigorously characterizes the shortest reasoning length that preserves answer correctness. We provide a recursive definition based on independently sampled sequences and prove the existence of its limit, establishing the first measurable lower bound for reasoning-chain compression. Building on an analysis of mainstream CoT compression strategies, we identify key structural factors that enable a model to approach MSL. Based on these insights, we propose TRiMS, which combines the GRPO algorithm with MSL-based estimation during training and mitigates training instabilities through dynamic batch aggregation and advantage computation using the batch-level standard deviation. TRiMS achieves over 80% CoT token reduction with a minor accuracy boost across all benchmarks.
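As a rough illustration of the quantity that MSL builds on, the sketch below computes the expected token length of the shortest correct reasoning path among $k$ samples (the quantity named in the Figure 1 caption). It is a minimal sketch under stated assumptions, not the paper's estimator: the function name `expected_shortest_correct_length`, the `(length, is_correct)` input format, drawing the $k$ samples without replacement from a fixed pool, and conditioning on at least one correct sample are all choices made here for illustration.

```python
from math import comb
from typing import List, Optional, Tuple


def expected_shortest_correct_length(
    samples: List[Tuple[int, bool]], k: int
) -> Optional[float]:
    """Expected length of the shortest correct path in a random size-k subset.

    `samples` is a pool of (token_length, is_correct) pairs (hypothetical format).
    The subset is drawn uniformly without replacement, and the expectation is
    conditioned on the subset containing at least one correct sample.
    Returns None when no subset of size k contains a correct sample.
    """
    n = len(samples)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(samples)")

    correct = sorted(length for length, ok in samples if ok)
    c = len(correct)

    # Number of size-k subsets that contain at least one correct sample.
    valid_subsets = comb(n, k) - comb(n - c, k)
    if valid_subsets == 0:
        return None

    total = 0
    for i, length in enumerate(correct, start=1):
        # The i-th shortest correct sample is the subset minimum when it is
        # chosen and none of the i-1 shorter correct samples are chosen; the
        # other k-1 members come from the remaining n-i samples.
        total += length * comb(n - i, k - 1)
    return total / valid_subsets


if __name__ == "__main__":
    pool = [(10, False), (20, True), (30, True)]
    for k in (1, 2, 3):
        print(k, expected_shortest_correct_length(pool, k))
```

With a pool of sampled responses per question, evaluating this function over increasing `k` gives a curve of the kind the Figure 1 caption describes; how the paper turns such a curve into its recursive MSL definition is not specified in this section.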

Paper Structure (39 sections, 6 equations, 12 figures, 2 tables)

Introduction
Importance:
Observation:
Motivation:
Contributions:
Related Work
Inference-time Strategies
Learning-based Compression
Minimal Sufficient Length
Definition of Minimal Sufficient Length
Existence of Minimal Sufficient Length Under Different Sampling Strategies
Existence of Minimal Sufficient Length Across Model Scales
TRiMS
Rationale
Why Use RL
...and 24 more sections

Figures (12)

Figure 1: (Top) Conceptual illustration of diverse reasoning paths generated by an LLM. (Bottom) Expected token length of the shortest correct reasoning path (SCPT@K) as a function of the number of samples $k$.
Figure 2: Expected token length of the shortest correct reasoning path versus sample number $k$. (a–d) Different sampling strategies. (e) Model scaling under varying difficulty levels.
Figure 3: Overview of the TRiMS framework
Figure 4: Percentage of degenerate groups across training steps
Figure 5: Distribution of the proportion of correct answers across different length thresholds and difficulty levels.
...and 7 more figures
