Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding

Weizhi Fei, Xueyan Niu, Guoqing Xie, Yanhua Zhang, Bo Bai, Lei Deng, Wei Han

TL;DR

This work reframes long-context reasoning as interactive knowledge editing, enabling LLMs with fixed context windows to perform multi-hop reasoning by planning sub-questions and retrieving relevant context chunks within a DAG-based framework. It introduces two algorithms, Iterative QA with Fact Extraction and Knowledge-constrained Decoding, both built on planning and retrieval modules that update the model’s reasoning without parameter updates. Across long-context QA benchmarks and a synthetic variable-tracking task, the proposed approach outperforms fixed-window baselines and competitive long-context methods, with the knowledge-constrained decoding variant delivering the strongest results. The method offers a practical, plug-and-play path to enhance long-context reasoning on commodity hardware, although it is limited by the scope of the evaluated datasets and by how well its prompts generalize. It also highlights the conceptual link between long-context reasoning and knowledge editing.

Abstract

Current Large Language Models (LLMs) face inherent limitations due to their pre-defined context lengths, which impede their capacity for multi-hop reasoning within extensive textual contexts. While existing techniques like Retrieval-Augmented Generation (RAG) have attempted to bridge this gap by sourcing external information, they fall short when direct answers are not readily available. We introduce a novel approach that re-imagines information retrieval through dynamic in-context editing, inspired by recent breakthroughs in knowledge editing. By treating lengthy contexts as malleable external knowledge, our method interactively gathers and integrates relevant information, thereby enabling LLMs to perform sophisticated reasoning steps. Experimental results demonstrate that our method effectively empowers context-limited LLMs, such as Llama2, to engage in multi-hop reasoning with improved performance, outperforming state-of-the-art context window extrapolation methods and even comparing favorably to more advanced commercial long-context models. Our interactive method not only enhances reasoning capabilities but also mitigates the associated training and computational costs, making it a pragmatic solution for enhancing LLMs' reasoning within expansive contexts.

Paper Structure (33 sections, 1 equation, 4 figures, 4 tables, 1 algorithm)

This paper contains 33 sections, 1 equation, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: An instance of complex reasoning that requires synthesizing details from various parts of the text. The text of The Earthsea Cycle is provided to the model as input context, and the task is to identify the uncle of the character Serriadh. Top-left: Using a commercial LLM with a longer context window, the final answer is coherent yet incorrect. Bottom-left: Using retrieval-augmented generation, the model is still unable to find the correct answer. Right: The proposed method interactively generates sub-questions and extracts relevant facts, which are then used to plan subsequent steps. Because the input context is long, the original text is segmented into manageable chunks, allowing the LLM to answer questions and retrieve information from the most contextually relevant chunks.
  • Figure 2: The proposed methods interactively utilize two standard modules: planning (depicted in green) and retrieval (depicted in yellow). As shown in illustration (a), a chain graph is employed; however, any arbitrary reasoning graph can be applied in general.
  • Figure 3: Variable-tracking accuracy of Llama2 and our method. Our method, which uses knowledge-constrained decoding, is represented by squares, and Llama2 by circles. Color indicates the number of chains or hops. (a) Varying the number of chains with the number of hops fixed at 2. (b) Varying the number of hops with the number of chains fixed at 1. In all of these settings, our method maintains high accuracy.
  • Figure 4: The template prompts of the two algorithms for the question "What is the capital city of the country of citizenship of Ivanka Trump's spouse?". (a) The template prompt of Algorithm 1, Iterative QA with Fact Extraction. (b) The template prompt of Algorithm 2, Knowledge-constrained Decoding.
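
The loop described in the captions of Figures 1 and 2 (segment the context into chunks, pose a sub-question, retrieve the most relevant chunk, extract a fact, and use it to form the next sub-question) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name is hypothetical, the plan is a hand-written chain rather than one produced by an LLM planner, retrieval uses simple word overlap in place of whatever retriever the paper uses, and a rule-based reader stands in for the LLM. The entities in the toy context are fictional and mirror the three-hop shape of the Figure 4 question.

```python
import re
from typing import Callable, List, Tuple


def chunk_text(text: str) -> List[str]:
    """Split the context into chunks (here: one sentence per chunk, an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(s: str) -> set:
    """Lowercased word set used for the overlap score."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Rank chunks by word overlap with the query (stand-in for a real retriever)."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:top_k]


def iterative_qa(
    plan: List[str],
    chunks: List[str],
    answer_fn: Callable[[str, List[str]], str],
) -> List[Tuple[str, str]]:
    """Walk a chain-shaped plan; '{prev}' in a template is filled with the last fact."""
    facts: List[Tuple[str, str]] = []
    prev = ""
    for template in plan:
        question = template.format(prev=prev)
        context = retrieve(question, chunks)
        prev = answer_fn(question, context)  # an LLM call in a real system
        facts.append((question, prev))
    return facts


def toy_reader(question: str, context: List[str]) -> str:
    """Hypothetical stand-in for the LLM: return the last word of the top chunk."""
    return context[0].rstrip(".!?").split()[-1]


# Fictional toy context, including one distractor sentence.
context_text = (
    "The spouse of Alba is Benno. "
    "Alba was born in the town of Elst. "
    "Benno is a citizen of Corvia. "
    "The capital city of Corvia is Dunmere."
)

# Hand-written chain plan; in the described method the plan is generated step by step.
plan = [
    "Who is the spouse of Alba?",
    "Which country is {prev} a citizen of?",
    "What is the capital city of {prev}?",
]

facts = iterative_qa(plan, chunk_text(context_text), toy_reader)
for question, answer in facts:
    print(f"{question} -> {answer}")
```

Each step only ever sees the single retrieved chunk, which is the point of the design: the model's window holds one sub-question and its evidence at a time, while the accumulated facts carry the reasoning across hops.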

Theorems & Definitions (1)

  • Definition 1