Stepwise Informativeness Search for Efficient and Effective LLM Reasoning

Siyuan Wang, Enda Zhao, Zhongyu Wei, Xiang Ren

TL;DR

This work tackles the problem of LLMs producing unreliable and redundant rationales during long multi-step reasoning by identifying and leveraging underutilized prior steps. It introduces stepwise informativeness search, a stepwise beam search framework with grounding-guided and novelty-guided selection, plus a self-grounding strategy that prompts explicit grounding before each deduction. The approach reduces redundancy and improves accuracy across four datasets and multiple model families, while also reducing token costs compared to traditional beam-search baselines. The findings suggest a domain-agnostic mechanism to improve rationale quality and efficiency without relying on task-specific reward models, broadening practical applicability of deep reasoning in LLMs.

Abstract

Advances in Large Language Models (LLMs) have significantly improved multi-step reasoning through the generation of free-text rationales. However, recent studies show that LLMs tend to lose focus in the middle of long contexts. This raises the concern that, as reasoning progresses, LLMs may overlook information in earlier steps when decoding subsequent steps, leading them to generate unreliable and redundant rationales. To address this, we propose guiding LLMs to generate more accurate and concise step-by-step rationales by (1) proactively referencing information from underutilized prior steps, and (2) minimizing redundant information between new and existing steps. We introduce stepwise informativeness search, an inference-time tree search framework incorporating two selection heuristics: grounding-guided selection, which prioritizes steps that pay more attention to underutilized steps; and novelty-guided selection, which encourages steps with novel conclusions. During rationale generation, we use a self-grounding strategy that prompts LLMs to explicitly reference relevant prior steps as premises before making the deduction at each step. Experimental results on four reasoning datasets demonstrate that our approach improves reasoning accuracy by generating higher-quality rationales with fewer errors and less redundancy.
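
The two selection heuristics described in the abstract can be illustrated with a minimal Python sketch of one selection round in a stepwise beam search. This is not the authors' implementation: the `Candidate` fields, the `select_beam` function, the order in which the filters are applied, and the parameters `keep_grounded` and `min_novelty` are all hypothetical, and the grounding and novelty scores are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    steps: List[str]   # rationale so far, including the newly proposed step
    logprob: float     # cumulative log-likelihood of the rationale
    grounding: float   # assumed: attention the new step pays to underutilized prior steps
    novelty: float     # assumed: 1.0 = entirely new conclusion, 0.0 = pure repetition


def select_beam(candidates: List[Candidate], beam_size: int,
                keep_grounded: int, min_novelty: float) -> List[Candidate]:
    """Choose the next beam from candidate continuations (illustrative only)."""
    # Grounding-guided selection: keep the candidates that attend most
    # strongly to underutilized prior steps.
    grounded = sorted(candidates, key=lambda c: c.grounding, reverse=True)[:keep_grounded]
    # Novelty-guided selection: drop candidates whose conclusion mostly
    # repeats an earlier one.
    novel = [c for c in grounded if c.novelty >= min_novelty]
    # Fall back to the grounded pool if the novelty filter removes everything.
    pool = novel or grounded
    # Rank the survivors by cumulative likelihood, as plain beam search would.
    return sorted(pool, key=lambda c: c.logprob, reverse=True)[:beam_size]
```

In this sketch, a candidate with high likelihood can still be discarded if it is poorly grounded or repeats an earlier conclusion, which mirrors the behavior the abstract describes. How the paper actually combines the two heuristics with likelihood is not stated in this excerpt.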

Paper Structure

This paper contains 27 sections, 7 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: An example illustrating LLMs' difficulty in referencing early-step information (e.g., underutilization of [Step-2,4,5,6]), and the inclusion of redundant steps (e.g., repeated conclusions in [Step-5, 7]). The rightward red arrow indicates that the focus is on generating [Step-8] after [Step 1-7] have been generated.
  • Figure 2: Upper: Overview of our informativeness search framework, illustrated with beam size of 1. Green diagonal-striped blocks represent selected steps while gray blocks are discarded. Cross marks indicate incorrect deductions, and the orange crosshatched block highlights a redundant step that may lead to errors. Italics illustrate our self-grounding strategy. Bottom: While previous methods would accept this redundant [Step-7] as logically valid, our framework filters it out based on its low novelty and poor grounding on underutilized steps.
  • Figure 3: Accuracy and average token count (Avg. # Tokens) of final predicted rationales using different methods on Llama3.2-3B-Instruct.
  • Figure 4: Total token costs ($\times k$ tokens) of different stepwise beam search methods. Baseline refers to stepwise beam search using only cumulative likelihood scoring.
  • Figure 5: Average count of redundant steps whose conclusions have over 70% tri-word overlap with any previous conclusions in the same rationale.
  • ...and 3 more figures
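
The redundancy measure in the Figure 5 caption (conclusions with over 70% tri-word overlap with any previous conclusion) can be sketched as follows. This is a rough illustration under stated assumptions: the caption does not say how text is tokenized or what the overlap is normalized by, so lowercased whitespace tokenization and normalization by the new conclusion's trigram count are guesses, and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def trigrams(text: str) -> Set[Tuple[str, ...]]:
    """Return the set of word trigrams (assumed: lowercase, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def overlap_ratio(new: str, old: str) -> float:
    """Fraction of the new conclusion's trigrams that also occur in the old one."""
    new_tri = trigrams(new)
    if not new_tri:
        return 0.0  # fewer than three words: treated as non-redundant here
    return len(new_tri & trigrams(old)) / len(new_tri)


def count_redundant(conclusions: List[str], threshold: float = 0.7) -> int:
    """Count conclusions that overlap heavily with any earlier conclusion."""
    count = 0
    for i, conclusion in enumerate(conclusions):
        if any(overlap_ratio(conclusion, prev) > threshold for prev in conclusions[:i]):
            count += 1
    return count


rationale = [
    "the total cost is 12 dollars",
    "alice has 3 apples left",
    "the total cost is 12 dollars",
]
print(count_redundant(rationale))
```

Here only the third conclusion repeats an earlier one, so a single step is counted as redundant.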