Is Depth All You Need? An Exploration of Iterative Reasoning in LLMs

Zongqian Wu, Tianyu Li, Baoduo Xu, Jiaying Yang, Mengmeng Zhan, Xiaofeng Zhu, Lei Feng

TL;DR

This work investigates whether breadth reasoning can substitute for deep iterative reasoning in chain-of-thought (CoT) prompting for large language models. Through theoretical analysis and large-scale experiments, the authors show that generating diverse initial reasoning paths and aggregating their predictions, via contextual reformulation and self-consistency, can outperform deep iterative approaches, especially on symbolic and commonsense tasks, while remaining competitive on arithmetic problems. They identify factors that influence reasoning diversity, quantify their impact with entropy analysis, and propose a practical method (QuestionC-SC and PromptC-SC) that extends reasoning breadth with reduced sampling randomness. The findings suggest that reasoning diversity is a viable, potentially more cost-effective alternative to iterative refinement, with implications for designing more robust CoT strategies and hybrid depth-breadth frameworks.
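As a rough illustration of what an entropy-based diversity measure could look like, the sketch below computes the Shannon entropy of the final answers produced by a set of reasoning paths. This is an assumption made for illustration: the summary above does not say which quantity the authors take the entropy of, and the function name `answer_entropy` is hypothetical.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in bits) of the empirical distribution of answers.

    Illustrative only: the paper's exact entropy definition is not given
    in this summary, so entropy over final answers is an assumption.
    """
    if not answers:
        raise ValueError("answers must be non-empty")
    counts = Counter(answers)
    total = len(answers)
    # Sum -p * log2(p) over each distinct answer's empirical probability p.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Four paths that all agree carry no diversity; four distinct answers
# give the maximum entropy for four samples.
uniform = answer_entropy(["12", "15", "18", "21"])
identical = answer_entropy(["12", "12", "12", "12"])
```

Under this reading, a higher value means the sampled paths disagree more, which is one way to compare how strongly different sources of variation diversify the initial reasoning paths.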

Abstract

Deep iterative chain-of-thought (CoT) reasoning enables LLMs to tackle complex tasks by progressively activating relevant pre-trained knowledge. However, it faces challenges in ensuring continual improvement and determining a stopping criterion. In this paper, we investigate whether the relevant knowledge that contributes directly to solving the given question can be activated from the initial reasoning path, thus circumventing the need for iterative refinement. Our experiments reveal that increasing the diversity of initial reasoning paths can achieve comparable or superior performance, a concept we term breadth reasoning. However, existing breadth reasoning approaches, such as self-consistency, offer limited diversity. To address this limitation, we propose a simple yet effective method that enhances reasoning breadth by integrating contextual exploration with reduced sampling randomness. Extensive experiments demonstrate that our approach significantly outperforms deep iterative reasoning. Our code is available at https://github.com/zongqianwu/breadth.
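The aggregation step behind breadth reasoning can be sketched as a majority vote over predictions obtained from several reformulations of the same question. The code below is a minimal sketch under stated assumptions, not the authors' implementation: `breadth_vote`, `question_variants`, and `solve` are hypothetical names, and `solve` stands in for a single CoT call to an LLM that returns a final answer string.

```python
from collections import Counter


def breadth_vote(question_variants, solve, paths_per_variant=1):
    """Collect one or more predictions per question variant and majority-vote.

    question_variants: differently worded versions of the same question
                       (hypothetical input; how variants are produced is
                       not specified here).
    solve:             callable mapping a question string to an answer
                       string; a placeholder for an LLM CoT call.
    """
    predictions = []
    for variant in question_variants:
        for _ in range(paths_per_variant):
            predictions.append(solve(variant))
    if not predictions:
        raise ValueError("no predictions were collected")
    # Ties are broken in favour of the answer that appeared first.
    answer, _count = Counter(predictions).most_common(1)[0]
    return answer, predictions


# Usage with a canned stand-in for the model: two variants agree on "8".
canned = {"variant A": "8", "variant B": "8", "variant C": "9"}
final_answer, all_predictions = breadth_vote(list(canned), canned.__getitem__)
```

Plain self-consistency corresponds to a single variant with many sampled paths; spreading the budget across reformulated variants is one way to read the "contextual exploration" described above.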

Paper Structure

This paper contains 25 sections, 4 equations, 7 figures, 12 tables, 1 algorithm.

Figures (7)

  • Figure 1: Deep Iterative Reasoning extends CoT reasoning by iteratively feeding both the reasoning steps and the predictions back into LLMs, while Breadth Reasoning generates diverse reasoning paths and aggregates the corresponding predictions.
  • Figure 2: The effectiveness of deep iterative reasoning across different task types. The upper part demonstrates how deep iterative reasoning improves performance on arithmetic tasks, which rely on logical reasoning. As iterations progress, prior logical knowledge is progressively activated. In contrast, the lower part shows limited improvements on commonsense tasks, which primarily depend on information retrieval. If LLMs have not encountered relevant commonsense knowledge during pre-training, deeper reasoning alone is unlikely to resolve these challenges.
  • Figure 3: Performance of the LLMs with the deep iterative reasoning mechanisms across varying iterations on arithmetic tasks (left) using the AQuA and AddSub datasets, and commonsense tasks (right) using the StrategyQA and CommonsenseQA datasets.
  • Figure 4: Performance of self-consistency on a subset of AddSub (left) and AQuA (right) under varying numbers of reasoning paths. This subset consists of samples that were originally misclassified by standard CoT but were correctly predicted through deep iterative reasoning.
  • Figure 5: The upper part reviews the complete CoT reasoning process, in which the given question is concatenated with the pre-defined prompt and fed into LLMs for reasoning, ultimately producing the corresponding prediction. Four factors that may contribute to diverse initial reasoning paths are identified in this process: modifying the expression of the given question ➀ or of the pre-defined prompt ➁; applying perturbations to the LLMs themselves ➂; and sampling initial reasoning paths from the LLMs ➃. The lower part elaborates on how these factors diversify the initial reasoning paths, leading to multiple predictions.
  • ...and 2 more figures