RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering

Sichu Liang, Linhai Zhang, Hongyu Zhu, Wenwen Wang, Yulan He, Deyu Zhou

TL;DR

RGAR tackles a key limitation of medical retrieval-augmented generation by explicitly incorporating factual knowledge from EHRs alongside conceptual knowledge from medical corpora through a recurrence-based, dual-end retrieval framework. It decomposes the process into Conceptual Knowledge Retrieval (CKR) and Factual Knowledge Extraction (FKE), connected by a recurrence pipeline that iteratively refines queries and retrieved content, yielding a final set of chunks used for answer generation. Across three factual-aware benchmarks (MedQA-USMLE, MedMCQA, EHRNoteQA), RGAR achieves state-of-the-art results, with Llama-3.1-8B-Instruct plus RGAR outperforming larger models like GPT-3.5-turbo in several settings, underscoring the value of extracting factual knowledge to boost retrieval quality. The approach demonstrates significant practical potential for clinical decision support while acknowledging computational costs and ethical considerations around deploying retrieval-based medical QA systems.

Abstract

Medical question answering requires extensive access to specialized conceptual knowledge. The current paradigm, Retrieval-Augmented Generation (RAG), acquires expert medical knowledge through large-scale corpus retrieval and uses this knowledge to guide a general-purpose large language model (LLM) in generating answers. However, existing retrieval approaches often overlook the importance of factual knowledge, which limits the relevance of retrieved conceptual knowledge and restricts its applicability in real-world scenarios, such as clinical decision-making based on Electronic Health Records (EHRs). This paper introduces RGAR, a recurrence generation-augmented retrieval framework that retrieves both relevant factual and conceptual knowledge from dual sources (i.e., EHRs and the corpus), allowing them to interact and refine each other. Through extensive evaluation across three factual-aware medical question answering benchmarks, RGAR establishes new state-of-the-art performance among medical RAG systems. Notably, the Llama-3.1-8B-Instruct model with RGAR surpasses the considerably larger, RAG-enhanced GPT-3.5. Our findings demonstrate the benefit of extracting factual knowledge for retrieval, which consistently yields improved generation quality.
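
The interaction between the two knowledge sources can be pictured with a small sketch. This is not the authors' implementation: the token-overlap retriever and the sentence filter below are hypothetical stand-ins for the paper's retrieval and extraction components, and the corpus, EHR note, and function names are invented for illustration. It only shows the shape of the loop, in which facts extracted from the EHR sharpen the query and the retrieved chunks in turn decide which facts are extracted.

```python
from typing import List, Set

def tokens(text: str) -> Set[str]:
    """Lowercased words longer than three characters, punctuation stripped."""
    words = (w.strip(".,;:?!()").lower() for w in text.split())
    return {w for w in words if len(w) > 3}

def retrieve(query: str, corpus: List[str], k: int) -> List[str]:
    """Toy conceptual retrieval: rank chunks by token overlap with the query."""
    q = tokens(query)
    return sorted(corpus, key=lambda c: len(q & tokens(c)), reverse=True)[:k]

def extract_facts(ehr: str, question: str, chunks: List[str]) -> str:
    """Toy factual extraction: keep EHR sentences that share a token with
    the question or with the chunks retrieved in the previous round."""
    context = tokens(question).union(*(tokens(c) for c in chunks))
    sentences = [s.strip() for s in ehr.split(".") if s.strip()]
    return ". ".join(s for s in sentences if tokens(s) & context)

def recurrence(ehr: str, question: str, corpus: List[str],
               rounds: int = 2, k: int = 2) -> List[str]:
    """Alternate extraction and retrieval; return the final set of chunks."""
    chunks: List[str] = []
    for _ in range(rounds):
        facts = extract_facts(ehr, question, chunks)          # factual side
        chunks = retrieve(f"{facts} {question}", corpus, k)   # conceptual side
    return chunks

# Invented toy data, for illustration only.
CORPUS = [
    "Blood pressure monitoring is advised in hypertension.",
    "Patients taking warfarin require INR monitoring.",
    "Metformin is first-line therapy for type 2 diabetes.",
]
EHR = "Patient takes warfarin daily. Patient enjoys gardening."
QUESTION = "Which lab value needs monitoring?"

first_pass = recurrence(EHR, QUESTION, CORPUS, rounds=1)
second_pass = recurrence(EHR, QUESTION, CORPUS, rounds=2)
```

In this toy run the first pass cannot tell the two monitoring chunks apart, because the question alone mentions no drug. Once the retrieved chunks surface "warfarin", the second pass pulls the matching EHR sentence into the query and the warfarin chunk moves to the top.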

Paper Structure

This paper contains 28 sections, 7 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: a) Medical AI Systems from the Perspective of Bloom's Taxonomy. b) Two Types of Medical Question Answering Tasks.
  • Figure 2: The Overall Framework of RGAR. a) The Recurrence Pipeline in § \ref{sec:pipeline}; b) Conceptual Knowledge Retrieval in § \ref{sec:Train-free}; c) Factual Knowledge Extraction in § \ref{sec:Extraction}; d) Response Template in § \ref{sec:pipeline}.
  • Figure 3: Accuracy with Different Numbers of Retrieved Chunks on EHRNoteQA Dataset.
  • Figure 4: Fine-Grained Accuracy of EHRNoteQA After Sorting by Length and Dividing into Four Equal Parts.
  • Figure 5: t-SNE Visualization of Different Queries and the Retrieved Chunks.
  • ...and 1 more figure