Exploring the Necessity of Reasoning in LLM-based Agent Scenarios

Xueyang Zhou; Guiyao Tie; Guowen Zhang; Weidong Wang; Zhigang Zuo; Di Wu; Duanfeng Chu; Pan Zhou; Neil Zhenqiang Gong; Lichao Sun

TL;DR

This work examines whether explicit reasoning is essential for LLM-based agents in the era of Large Reasoning Models (LRMs). It introduces LaRMA, a three-phase framework that segments tasks, evaluates generic agent paradigms (ReAct and Reflexion), and benchmarks diverse LLMs and LRMs across multiple datasets with rigorous metrics. Key findings show LRMs excel at reasoning-intensive tasks like Plan Design and Problem Solving, while LLMs outperform in execution-focused Tool Usage; hybrid actor-reflector configurations further enhance performance, especially under Reflexion. However, LRMs incur higher computational costs and exhibit behavioral challenges such as overthinking and fact-ignoring tendencies, motivating balanced, hybrid designs for practical agent systems with improved efficiency and reliability.

Abstract

The rise of Large Reasoning Models (LRMs) signifies a paradigm shift toward advanced computational reasoning. Yet this progress disrupts agent frameworks that have traditionally been anchored by execution-oriented Large Language Models (LLMs). To explore this transformation, we propose the LaRMA framework, encompassing nine tasks across Tool Usage, Plan Design, and Problem Solving, assessed with three top LLMs (e.g., Claude3.5-sonnet) and five leading LRMs (e.g., DeepSeek-R1). Our findings address four research questions: (1) LRMs surpass LLMs in reasoning-intensive tasks like Plan Design, leveraging iterative reflection for superior outcomes; (2) LLMs excel in execution-driven tasks such as Tool Usage, prioritizing efficiency; (3) hybrid LLM-LRM configurations, pairing LLMs as actors with LRMs as reflectors, optimize agent performance by blending execution speed with reasoning depth; and (4) LRMs' enhanced reasoning incurs higher computational costs, prolonged processing, and behavioral challenges, including overthinking and fact-ignoring tendencies. This study fosters deeper inquiry into LRMs' balance of deep thinking and overthinking, laying a critical foundation for future agent design advancements.
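
The hybrid configuration described in the abstract, with an LLM as actor and an LRM as reflector, can be illustrated with a minimal Reflexion-style loop. This is only a sketch under stated assumptions, not the paper's implementation: `run_hybrid_agent`, the callable signatures, and the toy stand-in functions are all hypothetical names introduced here, and no real model API is called.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces: each "model" is a plain callable, not a real API.
Actor = Callable[[str, List[str]], str]   # (task, reflections so far) -> attempt
Reflector = Callable[[str, str], str]     # (task, failed attempt) -> reflection
Evaluator = Callable[[str], bool]         # attempt -> did it succeed?


def run_hybrid_agent(
    task: str,
    actor: Actor,
    reflector: Reflector,
    evaluate: Evaluator,
    max_trials: int = 3,
) -> Tuple[str, int, List[str]]:
    """Retry a task, feeding the reflector's feedback back to the actor.

    Returns the last attempt, the number of trials used, and the
    reflections accumulated along the way.
    """
    reflections: List[str] = []
    attempt = ""
    for trial in range(1, max_trials + 1):
        # The actor (assumed to be a fast, execution-oriented LLM) acts.
        attempt = actor(task, reflections)
        if evaluate(attempt):
            return attempt, trial, reflections
        # The reflector (assumed to be an LRM) critiques the failed attempt.
        reflections.append(reflector(task, attempt))
    return attempt, max_trials, reflections


# Toy stand-ins so the sketch runs without any model backend.
def toy_actor(task: str, reflections: List[str]) -> str:
    # Answers wrongly at first, correctly once it has received feedback.
    return "4" if reflections else "5"


def toy_reflector(task: str, attempt: str) -> str:
    return f"The answer {attempt} looks wrong; recheck the arithmetic."


def toy_evaluate(attempt: str) -> bool:
    return attempt == "4"


result = run_hybrid_agent("What is 2 + 2?", toy_actor, toy_reflector, toy_evaluate)
```

Keeping the actor and reflector as separate callables is the design choice that makes the pairing swappable: the same loop could in principle run LLM-LLM, LRM-LRM, or mixed configurations, which is the kind of comparison the abstract refers to.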
