Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation

Guanhua Chen, Wenhan Yu, Xiao Lu, Xiao Zhang, Erli Meng, Lei Sha

TL;DR

Knowledge-dense domains pose retrieval challenges for RAG because standard retrieval misses domain-specific perspectives. The paper presents MVRAG, a framework that combines offline extraction of professional perspectives via PCA/NMF, intention-aware query rewriting, and retrieval augmentation to produce multi-perspective evidence for generation. Across legal and medical tasks, MVRAG yields substantial gains in recall and precision and improves performance on complex inference tasks, while maintaining practical latency. The results demonstrate improved interpretability and reliability of RAG in knowledge-intensive fields and suggest broad applicability to other domains.

Abstract

While Retrieval-Augmented Generation (RAG) plays a crucial role in the application of Large Language Models (LLMs), existing retrieval methods in knowledge-dense domains like law and medicine still suffer from a lack of multi-perspective views, which are essential for improving interpretability and reliability. Previous research on multi-view retrieval often focused solely on different semantic forms of queries, neglecting the expression of specific domain knowledge perspectives. This paper introduces a novel multi-view RAG framework, MVRAG, tailored for knowledge-dense domains that utilizes intention-aware query rewriting from multiple domain viewpoints to enhance retrieval precision, thereby improving the effectiveness of the final inference. Experiments conducted on legal and medical case retrieval demonstrate significant improvements in recall and precision rates with our framework. Our multi-perspective retrieval approach unleashes the potential of multi-view information enhancing RAG tasks, accelerating the further application of LLMs in knowledge-intensive fields.
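
The multi-view idea described above can be illustrated with a minimal sketch: rewrite the query once per domain perspective, score documents against each rewritten query, and combine the scores. Everything in this block is an assumption made for illustration and is not the authors' implementation: `embed` is a toy bag-of-words stand-in for a dense encoder, `rewrite` is a template placeholder for LLM-based rewriting, and the weighted-sum fusion with hand-set `weights` stands in for whatever the intention-recognition step produces.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real dense encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rewrite(query, perspective):
    """Hypothetical placeholder for LLM rewriting: prepend the perspective name."""
    return f"{perspective} {query}"


def multi_view_retrieve(query, docs, weights, k=2):
    """Return indices of the top-k docs under a weighted sum of per-view scores.

    `weights` maps perspective name -> importance (assumed, hand-set here).
    """
    doc_vecs = [embed(d) for d in docs]
    scores = [0.0] * len(docs)
    for perspective, weight in weights.items():
        query_vec = embed(rewrite(query, perspective))
        for i, doc_vec in enumerate(doc_vecs):
            scores[i] += weight * cosine(query_vec, doc_vec)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        "the focus of dispute was ownership of the land",
        "weather report sunny",
        "basic fact the defendant took the land",
    ]
    weights = {"basic fact": 0.5, "focus of dispute": 0.5}
    print(multi_view_retrieve("land ownership", docs, weights))
```

In this toy setup, a document that matches any weighted perspective can surface even when it shares few terms with the raw query, which is the intuition behind retrieving from several viewpoints rather than one.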

Paper Structure (29 sections, 10 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: A t-SNE visualization of retrieval results from different methods using a legal database with manually added category labels, displayed in different colors in the plot. The magnifying glass represents the query vector, while the circles represent the retrieval results.
  • Figure 2: A case study showcasing the effectiveness of a multi-view retrieval framework in accurately diagnosing Huntington’s disease. The model corrects an initial misdiagnosis of Vitamin B12 deficiency by refining the search criteria to focus on neurodegenerative symptoms and family medical history, thus demonstrating the importance of multi-view search strategies in medical diagnostics. The professional perspectives used in the framework were determined during the offline part, ensuring their domain-specific relevance. The detailed case study is provided in the appendix (medical case).
  • Figure 3: Framework of our Multi-View RAG System. This figure demonstrates the system's core processes: Professional Perspectives Extraction, Intention Recognition and Query Rewriting, and Retrieval Augmentation, emphasizing the multi-view insights approach for intention-aware query rewriting.
  • Figure 4:
  • Figure 5: Ablation Study on the impact of perspective selection strategies in our framework on Medical and Legal datasets. In the legal domain, the chart shows Recall@5 and Recall@10 after excluding each perspective: Basic Fact, Focus of Dispute, Application of Law, Penalty, Criminal History. For the medical domain, it displays the effects of removing Medical History, Symptoms, Laboratory Data, Treatment Response, Lifestyle. Each bar indicates the performance impact versus the full baseline and direct retrieval.
  • ...and 1 more figure
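
The ablation in Figure 5 reports Recall@5 and Recall@10 after excluding one perspective at a time. A minimal sketch of that evaluation loop follows, assuming the standard definition of Recall@k (fraction of relevant items found in the top k); the `retrieve` callable and the toy data in the usage section are hypothetical and only show the shape of a leave-one-out comparison.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear among the first k retrieved items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def leave_one_out(perspectives, retrieve, relevant, k):
    """Recall@k with each perspective removed in turn.

    `retrieve` is a hypothetical callable: list of perspectives -> ranked doc ids.
    """
    results = {}
    for removed in perspectives:
        kept = [p for p in perspectives if p != removed]
        results[removed] = recall_at_k(retrieve(kept), relevant, k)
    return results


if __name__ == "__main__":
    # Toy data (assumption): each perspective contributes one document id.
    contributes = {"Symptoms": "d1", "Medical History": "d2", "Lifestyle": "d9"}

    def retrieve(kept):
        return [contributes[p] for p in kept]

    print(leave_one_out(list(contributes), retrieve, {"d1", "d2"}, k=5))
```

A larger drop in recall when a perspective is removed suggests that perspective contributes evidence the others do not cover.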