ERAGent: Enhancing Retrieval-Augmented Language Models with Improved Accuracy, Efficiency, and Personalization

Yunxiao Shi, Xing Zi, Zijing Shi, Haimin Zhang, Qiang Wu, Min Xu

TL;DR

This work tackles persistent limitations of retrieval-augmented generation, including retrieval quality for complex queries, inefficient long-term knowledge reuse, and lack of user personalization. It proposes ERAGent, a modular framework with Enhanced Question Rewriter, Retrieval Trigger, Knowledge Filter, Personalized LLM Reader, and Experiential Learner, plus a Memory Knowledge Database for incremental learning. Across one-round open-domain QA, one-round multi-hop QA, and multi-session multi-turn QA tasks over six datasets, ERAGent demonstrates improved accuracy, efficiency, and personalization, with strong synergies among its components. The results suggest ERAGent as a practical, scalable advancement for real-world RAG-based AI assistants and a foundation for further enhancements in retrieval-driven QA systems.

Abstract

Retrieval-augmented generation (RAG) for language models significantly improves language understanding systems. The basic retrieval-then-read pipeline for response generation has evolved into a more extended process through the integration of various components, sometimes even forming loop structures. Despite these advances in response accuracy, several challenges persist: poor retrieval quality for complex questions that require searching multifaceted semantic information, inefficient knowledge re-retrieval during long-term serving, and a lack of personalized responses. To address these limitations, we introduce ERAGent, a framework that advances the RAG area. Our contribution is the introduction of two synergistically operating modules, Enhanced Question Rewriter and Knowledge Filter, for better retrieval quality. Retrieval Trigger is incorporated to curtail extraneous external knowledge retrieval without sacrificing response quality. ERAGent also personalizes responses by incorporating a learned user profile. The efficiency and personalization of ERAGent are supported by the Experiential Learner module, which enables the AI assistant to expand its knowledge and model the user profile incrementally. Evaluations across six datasets and three question-answering tasks demonstrate ERAGent's superior accuracy, efficiency, and personalization, highlighting its potential to advance the RAG field and its applicability in practical systems.
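
The control flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyAssistant`, the helper `token_overlap`, the word-overlap similarity, the string-formatting "reader", and the way the threshold `tau` is applied are all assumptions made for illustration, and a real system would presumably use LLM calls and a learned similarity model in their place. The sketch only shows how a rewriter, a retrieval trigger backed by a memory store, a knowledge filter, and a profile-aware reader might be chained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (stand-in for a real similarity model)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0


@dataclass
class ToyAssistant:
    retrieve: Callable[[str], List[str]]  # hypothetical external retriever
    tau: float = 0.5                      # assumed similarity threshold for reusing memory
    profile: List[str] = field(default_factory=list)            # user facts
    memory: Dict[str, List[str]] = field(default_factory=dict)  # past query -> kept knowledge
    external_calls: int = 0               # counts external retrievals

    def rewrite(self, question: str) -> List[str]:
        # Placeholder rewriter: a real one might emit several refined queries.
        return [question.strip()]

    def lookup(self, query: str) -> Optional[List[str]]:
        # Retrieval trigger: reuse memory if a stored query is similar enough.
        best = max(self.memory, key=lambda k: token_overlap(k, query), default=None)
        if best is not None and token_overlap(best, query) >= self.tau:
            return self.memory[best]
        return None

    def filter(self, query: str, passages: List[str]) -> List[str]:
        # Knowledge filter: drop passages sharing no words with the query.
        return [p for p in passages if token_overlap(query, p) > 0.0]

    def read(self, question: str, knowledge: List[str]) -> str:
        # Placeholder reader: formats the inputs an LLM would be prompted with.
        return f"[profile: {', '.join(self.profile)}] {question} -> {' | '.join(knowledge)}"

    def answer(self, question: str) -> str:
        knowledge: List[str] = []
        for query in self.rewrite(question):
            cached = self.lookup(query)
            if cached is None:
                self.external_calls += 1
                cached = self.filter(query, self.retrieve(query))
                self.memory[query] = cached  # incremental knowledge accumulation
            knowledge.extend(cached)
        return self.read(question, knowledge)


def fake_retrieve(query: str) -> List[str]:
    # Hard-coded passages standing in for a search engine.
    return ["protein supports muscle growth", "the weather is sunny"]


bot = ToyAssistant(retrieve=fake_retrieve, tau=0.5, profile=["vegetarian"])
first = bot.answer("protein for muscle growth")
second = bot.answer("protein for muscle growth")
print(first)
print(bot.external_calls)
```

In this toy run the second, identical question is served from memory, so only one external retrieval takes place, and the unrelated passage is dropped by the filter before it reaches the reader.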

Paper Structure (28 sections, 3 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: The ERAGent Framework.
  • Figure 2: A case study on the application of the Enhanced Question Rewriter module in the clinical medicine area.
  • Figure 3: Two AI assistants, ERAGent without User Profile (Assistant A) and ERAGent with User Profile (Assistant B), respond to a user who asks "Give me a dietary recommendation for building muscle". The User Profile is summarized from historical conversational sessions. GPT-4 is then presented with the context to determine which assistant answers better.
  • Figure 4: The results of pairwise comparisons between Assistant B and Assistant A's responses across all categories on the MSMTQA dataset.
  • Figure 5: Response efficiency and quality metrics as the similarity threshold $\tau$ varies.
  • ...and 6 more figures