Bridge-RAG: An Abstract Bridge Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter

Zihang Li, Wenjun Liu, Yikun Zong, Jiawen Tao, Siying Dai, Songcheng Ren, Zirui Liu, Yanbing Jiang, Tong Yang

Abstract

As an important paradigm for enhancing the generation quality of Large Language Models (LLMs), retrieval-augmented generation (RAG) faces two challenges: retrieval accuracy and computational efficiency. This paper presents a novel RAG framework called Bridge-RAG. To address the accuracy challenge, we introduce the concept of an abstract, which bridges query entities and document chunks and provides robust semantic understanding. We organize the abstracts into a tree structure and design a multi-level retrieval strategy to ensure that sufficient contextual information is included. To address the efficiency challenge, we introduce an improved Cuckoo Filter, an efficient data structure supporting rapid membership queries and updates, to accelerate entity location during retrieval. We design a block linked list structure and an entity temperature-based sorting mechanism to improve efficiency in terms of spatial and temporal locality. Extensive experiments show that Bridge-RAG improves accuracy by around 15.65% and reduces retrieval time by 10x to 500x compared with other RAG frameworks.

Paper Structure

This paper contains 28 sections, 5 equations, 5 figures, 1 table, and 2 algorithms.

Figures (5)

  • Figure 1: The workflow of Bridge-RAG: entities are identified from the query, the Cuckoo Filter locates relevant abstracts via $O(1)$ lookup, hierarchical context enrichment retrieves parent and child abstracts along with their chunks, and the multi-level context is integrated into a comprehensive prompt for the LLM to generate context-aware responses.
  • Figure 2: The process of relationship extraction.
  • Figure 3: Examples of erroneous relations.
  • Figure 4: The workflow of Bridge-RAG when the query contains entity $x$. Abstracts with a higher temperature are placed earlier in the bucket. The addresses of the abstracts in different trees (from which the corresponding chunk addresses are derived) are linked by a block linked list.
  • Figure 5: Search time per query round for different numbers of trees and abstracts. Each round is one search of the abstract forest; the abstract addresses are inserted into the improved Cuckoo Filter before the first search.
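
The entity lookup path summarized in the captions of Figures 1 and 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class and method names, the 16-bit fingerprint, the bucket size, and the definition of temperature as a simple access counter are all assumptions made for illustration, and a plain Python list stands in for the block linked list of abstract addresses.

```python
import hashlib
import random

BUCKET_SIZE = 4   # slots per bucket (assumed value)
MAX_KICKS = 64    # relocation attempts before giving up (assumed value)


def _h(data: bytes) -> int:
    """Stable 64-bit hash; Python's built-in hash() is salted per process."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


class Slot:
    """One bucket entry: fingerprint, temperature, and abstract addresses."""

    def __init__(self, fp, addresses):
        self.fp = fp
        self.temperature = 0              # assumed: number of successful lookups
        self.addresses = list(addresses)  # stand-in for the block linked list


class TemperatureCuckooFilter:
    """Hypothetical cuckoo filter whose buckets are ordered by temperature."""

    def __init__(self, num_buckets=1024, seed=0):
        if num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.m = num_buckets
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    def _fingerprint(self, entity):
        return (_h(b"fp:" + entity.encode()) & 0xFFFF) or 1  # 16-bit, non-zero

    def _index(self, entity):
        return _h(entity.encode()) & (self.m - 1)

    def _alt(self, index, fp):
        # Partial-key cuckoo hashing: the alternate bucket depends only on the
        # current index and the fingerprint, so a slot can be relocated
        # without knowing the original entity string.
        return (index ^ _h(fp.to_bytes(2, "big"))) & (self.m - 1)

    def _place(self, i, slot):
        # Append, then restore the hottest-first order (the sort is stable).
        self.buckets[i].append(slot)
        self.buckets[i].sort(key=lambda s: -s.temperature)

    def insert(self, entity, addresses):
        fp = self._fingerprint(entity)
        i1 = self._index(entity)
        i2 = self._alt(i1, fp)
        for i in (i1, i2):
            for slot in self.buckets[i]:
                if slot.fp == fp:  # known fingerprint: extend its address list
                    slot.addresses.extend(addresses)
                    return True
        slot = Slot(fp, addresses)
        for i in (i1, i2):
            if len(self.buckets[i]) < BUCKET_SIZE:
                self._place(i, slot)
                return True
        i = self.rng.choice((i1, i2))
        for _ in range(MAX_KICKS):
            # Evict a random resident and move it to its alternate bucket.
            victim = self.buckets[i].pop(self.rng.randrange(BUCKET_SIZE))
            self._place(i, slot)
            slot, i = victim, self._alt(i, victim.fp)
            if len(self.buckets[i]) < BUCKET_SIZE:
                self._place(i, slot)
                return True
        return False  # filter treated as full; the last victim is dropped

    def lookup(self, entity):
        """Return abstract addresses for the entity, or None if absent."""
        fp = self._fingerprint(entity)
        i1 = self._index(entity)
        for i in (i1, self._alt(i1, fp)):
            bucket = self.buckets[i]
            for slot in bucket:
                if slot.fp == fp:
                    slot.temperature += 1
                    bucket.sort(key=lambda s: -s.temperature)
                    return slot.addresses
        return None


filt = TemperatureCuckooFilter(num_buckets=8)
filt.insert("Paris", ["tree0/abstract3"])
filt.insert("Rome", ["tree1/abstract7", "tree2/abstract1"])
print(filt.lookup("Paris"))
```

Each lookup inspects at most two buckets of fixed size, which is where the constant-time behaviour comes from. As with any fingerprint-based filter, two different entities can share a fingerprint and bucket, so a lookup may occasionally return addresses for the wrong entity; this sketch does not attempt to resolve such collisions.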