TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking
Yongqi Fan, Xiaoyang Chen, Dezhi Ye, Jie Liu, Haijin Liang, Jin Ma, Ben He, Yingfei Sun, Tong Ruan
TL;DR
TFRank tackles the problem of deploying reasoning-enabled ranking under strict efficiency constraints by training small-scale LLMs with multi-task supervision that includes Chain-of-Thought (CoT) signals, then deploying a Think-Free inference mode that outputs relevance scores directly. The approach combines pointwise ranking with CoT distillation, fine-grained supervision, and optional GRPO optimization to achieve high accuracy with far lower latency than CoT-dependent baselines. Empirical results on BRIGHT show small backbones (as small as 1.7B parameters) rivaling much larger baselines, while BEIR results demonstrate competitive zero-shot generalization and strong efficiency gains. The Think-Free phenomenon indicates that reasoning ability learned during training can be invoked at inference without producing explicit reasoning chains, which makes production deployment in real-world IR systems practical.
Abstract
Reasoning-intensive ranking models built on Large Language Models (LLMs) have made notable progress. However, existing approaches often rely on large-scale LLMs and explicit Chain-of-Thought (CoT) reasoning, resulting in high computational cost and latency that limit real-world use. To address this, we propose TFRank, an efficient pointwise reasoning ranker based on small-scale LLMs. To improve ranking performance, TFRank effectively integrates CoT data, fine-grained score supervision, and multi-task training. Furthermore, it achieves an efficient "Think-Free" reasoning capability by employing a "think-mode switch" and pointwise format constraints. Specifically, this allows the model to leverage explicit reasoning during training while delivering precise relevance scores for complex queries at inference without generating any reasoning chains. Experiments show that TFRank achieves performance comparable to models with four times more parameters on the BRIGHT benchmark and demonstrates strong competitiveness on the BEIR benchmark. Further analysis shows that TFRank achieves an effective balance between performance and efficiency, providing a practical solution for integrating advanced reasoning into real-world systems. Our code and data are released in the repository: https://github.com/JOHNNY-fans/TFRank.
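
To make the inference-time behavior described in the abstract more concrete, the following is a minimal Python sketch of pointwise scoring with a think-mode switch turned off. It is an illustration under stated assumptions, not the released TFRank implementation: the prompt wording, the `/think` and `/no_think` switch strings, the `<score>` tag, the 0 to 4 score range, and the helper names (`build_prompt`, `parse_score`, `rank`, `toy_generate`) are all hypothetical. The `generate` callable stands in for a real small-scale LLM; here a toy word-overlap function replaces it so the sketch runs without any model.

```python
import re
from typing import Callable, List, Tuple


def build_prompt(query: str, doc: str, think: bool) -> str:
    """Build a pointwise prompt for one (query, document) pair.

    The switch strings and the score format are assumptions made for
    this sketch; they are not taken from the TFRank paper or repository.
    """
    switch = "/think" if think else "/no_think"
    return (
        f"Query: {query}\n"
        f"Document: {doc}\n"
        "Rate the relevance from 0 to 4 as <score>N</score>. "
        f"{switch}"
    )


# Pointwise format constraint (assumed): the answer carries one integer score.
SCORE_RE = re.compile(r"<score>\s*(\d+)\s*</score>")


def parse_score(output: str, default: int = 0) -> int:
    """Extract the integer score, falling back to `default` if it is absent."""
    match = SCORE_RE.search(output)
    return int(match.group(1)) if match else default


def rank(
    query: str, docs: List[str], generate: Callable[[str], str]
) -> List[Tuple[str, int]]:
    """Score each document independently with thinking off, then sort."""
    scored = [
        (doc, parse_score(generate(build_prompt(query, doc, think=False))))
        for doc in docs
    ]
    # sorted() is stable, so tied documents keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_generate(prompt: str) -> str:
    """Stand-in for an LLM call: scores by query/document word overlap."""
    lines = prompt.splitlines()
    query = lines[0][len("Query: "):]
    doc = lines[1][len("Document: "):]
    overlap = len(set(query.lower().split()) & set(doc.lower().split()))
    return f"<score>{min(overlap, 4)}</score>"


if __name__ == "__main__":
    candidates = [
        "cats sleep a lot",
        "how to train a ranking model",
        "train a small ranking model",
    ]
    for doc, score in rank("train small ranking model", candidates, toy_generate):
        print(score, doc)
```

Because each document is scored in a separate call and the output is a short score rather than a reasoning chain, the per-document generation cost stays small, which is the efficiency argument the abstract makes for the Think-Free mode. In a real deployment, `toy_generate` would be replaced by a call to the trained ranker.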
