DeepNote: Note-Centric Deep Retrieval-Augmented Generation

Ruobing Wang, Qingfei Zhao, Yukun Yan, Daren Zha, Yuxuan Chen, Shi Yu, Zhenghao Liu, Yixuan Wang, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun

TL;DR

DeepNote is an adaptive RAG framework that explores knowledge sources in depth and robustly through note-centric adaptive retrieval, gathering knowledge with both high density and high quality.

Abstract

Retrieval-Augmented Generation (RAG) mitigates factual errors and hallucinations in Large Language Models (LLMs) for question-answering (QA) by incorporating external knowledge. However, existing adaptive RAG methods rely on LLMs to predict retrieval timing and directly use retrieved information for generation, often failing to reflect real information needs and fully leverage retrieved knowledge. We develop DeepNote, an adaptive RAG framework that achieves in-depth and robust exploration of knowledge sources through note-centric adaptive retrieval. DeepNote employs notes as carriers for refining and accumulating knowledge. During in-depth exploration, it uses these notes to determine retrieval timing, formulate retrieval queries, and iteratively assess knowledge growth, ultimately leveraging the best note for answer generation. Extensive experiments and analyses demonstrate that DeepNote significantly outperforms all baselines (+10.2% to +20.1%) and exhibits the ability to gather knowledge with both high density and quality. Additionally, DPO further improves the performance of DeepNote. The code and data are available at https://github.com/thunlp/DeepNote.
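
The note-centric loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): every name below (`run_deepnote_sketch`, `toy_retrieve`, `toy_next_query`, `toy_refine`, `toy_has_grown`) is hypothetical, the `max_steps` and `max_failures` stopping limits are assumptions, and the LLM and retriever are replaced by deterministic toy functions so the control flow can be run in isolation.

```python
from typing import Callable, Dict, List

Note = List[str]  # toy note: a list of distinct facts (assumption, not the paper's format)


def run_deepnote_sketch(
    question: str,
    retrieve: Callable[[str], List[str]],
    next_query: Callable[[str, Note, List[str]], str],
    refine: Callable[[Note, List[str]], Note],
    has_grown: Callable[[Note, Note], bool],
    max_steps: int = 3,      # assumed cap on retrieval rounds
    max_failures: int = 2,   # assumed cap on rounds without knowledge growth
) -> Note:
    """Return the best note found; an answer would then be generated from it."""
    # Note initialization: build the first note from passages for the question.
    best_note = refine([], retrieve(question))
    asked = [question]
    failures = 0
    for _ in range(max_steps):
        # The best note so far decides what to retrieve next.
        query = next_query(question, best_note, asked)
        asked.append(query)
        candidate = refine(best_note, retrieve(query))
        # Keep the candidate only if it adds knowledge; otherwise count a failure.
        if has_grown(best_note, candidate):
            best_note = candidate
        else:
            failures += 1
            if failures >= max_failures:
                break  # stop retrieving once knowledge stops growing
    return best_note


# Toy stand-ins for the retriever and the LLM-driven steps (all hypothetical).
CORPUS: Dict[str, List[str]] = {
    "deepnote": ["DeepNote is an adaptive RAG framework."],
    "notes": ["Notes carry refined, accumulated knowledge."],
}


def toy_retrieve(query: str) -> List[str]:
    return CORPUS.get(query.lower(), [])


def toy_next_query(question: str, note: Note, asked: List[str]) -> str:
    remaining = [key for key in CORPUS if key not in asked]
    return remaining[0] if remaining else question


def toy_refine(note: Note, passages: List[str]) -> Note:
    return note + [p for p in passages if p not in note]


def toy_has_grown(old: Note, new: Note) -> bool:
    return len(new) > len(old)


final_note = run_deepnote_sketch(
    "deepnote", toy_retrieve, toy_next_query, toy_refine, toy_has_grown
)
```

In a real system the four callables would wrap a retriever and LLM prompts; the sketch only shows how a single best note can drive both the next query and the decision to stop.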

Paper Structure

This paper contains 37 sections, 6 equations, 6 figures, 21 tables.

Figures (6)

  • Figure 1: Illustration of DeepNote. DeepNote fully integrates knowledge retrieved across multiple iterations using notes as the knowledge carrier and employs the best note to formulate retrieval decisions.
  • Figure 2: Overview of DeepNote. DeepNote consists of three processes: Note Initialization, Note-Centric Adaptive Retrieval, and Note-Informed Answer Generation. We employ a note-centric strategy to formulate retrieval decisions (including "when and what to retrieve"), accumulate knowledge, and generate answers.
  • Figure 3: Knowledge Density Comparison on Llama3.1-70B-Instruct. "Init Note" denotes the initial note. We calculated the arithmetic mean of token length, density, and performance.
  • Figure 4: Performance on different adaptive hyper-parameters with Llama3.1-70B-Instruct.
  • Figure 5: Retrieval efficiency on different adaptive hyper-parameters with Llama3.1-70B-Instruct.
  • ...and 1 more figure