Information Re-Organization Improves Reasoning in Large Language Models
Xiaoxia Cheng, Zeqi Tan, Wei Xue, Weiming Lu
TL;DR
This work tackles the challenge of context-aware multi-hop reasoning in large language models by shifting focus from solely improving reasoning steps to reorganizing contextual information. It introduces InfoRE, a two-step pipeline that first extracts explicit logical relationships from context into MindMap structures and then prunes noise via an RL-trained BERT model, producing a reorganized context for reasoning. Across multiple models (Llama2-70B, GPT-3.5, GPT-4) and tasks (claim verification, QA, and reading comprehension), InfoRE yields consistent zero-shot improvements, with sizable gains on complex, cross-document reasoning; GPT-4 often achieves the highest performance, and combining InfoRE with CoT provides complementary benefits. The results highlight the value of explicit context organization for robust reasoning and point to practical applications in domains requiring deep contextual understanding, while also acknowledging potential risks such as misinformation propagation and the need for scalable re-organization strategies.
Abstract
Improving the reasoning capabilities of large language models (LLMs) has attracted considerable interest. Recent approaches primarily focus on improving the reasoning process to yield a more precise final answer. However, in context-aware reasoning scenarios, these methods neglect the importance of first identifying logical relationships in the context before proceeding with the reasoning. This oversight can lead to a superficial understanding of, and interaction with, the context, potentially undermining the quality and reliability of the reasoning outcomes. In this paper, we propose an information re-organization (InfoRE) method that is applied before reasoning to enhance the reasoning ability of LLMs. Our re-organization method first extracts logical relationships from the contextual content, such as documents or paragraphs, and then prunes redundant content to minimize noise. We then use the re-organized information in the reasoning process. This enables LLMs to understand the contextual content more deeply by clearly perceiving these logical relationships, while also ensuring high-quality responses by eliminating potential noise. To demonstrate the effectiveness of our approach, we conduct experiments using Llama2-70B, GPT-3.5, and GPT-4 on various context-aware multi-hop reasoning tasks. Using only a zero-shot setting, our method achieves an average absolute improvement of 4% across all tasks, highlighting its potential to improve the reasoning performance of LLMs. Our source code is available at https://github.com/hustcxx/InfoRE.
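The extract-then-prune flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Relation` type, the `stub_extractor` function, the token-overlap `relevance` score, the threshold value, and the prompt layout are all hypothetical stand-ins. In particular, the overlap score only takes the place of a learned pruning model, and the stub takes the place of an LLM-based extraction step. Placing the original context alongside the kept relations in the prompt is likewise an assumption.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Relation:
    """Hypothetical container for one extracted logical relationship."""
    head: str
    relation: str
    tail: str

    def as_text(self) -> str:
        return f"{self.head} {self.relation} {self.tail}"


def tokens(text: str) -> set:
    """Lowercased alphanumeric tokens, punctuation dropped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance(question: str, rel: Relation) -> float:
    """Toy score: share of question tokens that also occur in the relation.
    Assumption: a stand-in for a learned pruning model, not the paper's scorer."""
    q = tokens(question)
    return len(q & tokens(rel.as_text())) / max(len(q), 1)


def reorganize(
    context: str,
    question: str,
    extractor: Callable[[str], List[Relation]],
    threshold: float = 0.1,  # illustrative value, not taken from the paper
) -> List[Relation]:
    """Step 1: extract relations from the context. Step 2: prune low-scoring ones."""
    relations = extractor(context)
    return [r for r in relations if relevance(question, r) >= threshold]


def build_prompt(context: str, question: str, relations: List[Relation]) -> str:
    """Assumed prompt layout: kept relations first, then context and question."""
    listed = "\n".join(f"- {r.as_text()}" for r in relations)
    return (
        f"Relations:\n{listed}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def stub_extractor(context: str) -> List[Relation]:
    """Hard-coded output standing in for an LLM extraction call."""
    return [
        Relation("Inception", "directed by", "Christopher Nolan"),
        Relation("Christopher Nolan", "born in", "London"),
        Relation("Titanic", "released in", "1997"),
    ]


CONTEXT = (
    "Inception was directed by Christopher Nolan. "
    "Christopher Nolan was born in London. "
    "Titanic was released in 1997."
)
QUESTION = "Where was the director of Inception born?"

kept = reorganize(CONTEXT, QUESTION, stub_extractor)
prompt = build_prompt(CONTEXT, QUESTION, kept)
```

In this toy run the two relations that form the reasoning chain share a token with the question and are kept, while the unrelated relation scores zero and is pruned. A lexical score of this kind would miss relations that matter only through an intermediate entity, which is one reason a learned pruning model is the more plausible choice in practice.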
