POLYRAG: Integrating Polyviews into Retrieval-Augmented Generation for Medical Applications
Chunjing Gan, Dan Yang, Binbin Hu, Ziqi Liu, Yue Shen, Zhiqiang Zhang, Jian Wang, Jun Zhou
TL;DR
This work introduces PolyRAG, a retrieval-augmented generation framework for medical applications that integrates multiple retrieval perspectives (polyviews) to better handle timeliness, authoritativeness, and knowledge diversity. Documents are evaluated along six polyviews and combined via a multi-rewards view-mixture to select the top results, which then drive polyview-grounded generation. To address evaluation gaps, the authors propose PolyEVAL, a real-world medical benchmark with diverse domains, query intents, and annotated attributes. Experiments demonstrate that PolyRAG improves both retrieval and generation performance, with latency that remains feasible for practical deployment. The approach offers a principled way to reduce hallucination and to favor up-to-date, authoritative medical information in RAG systems, with potential application beyond medicine.
Abstract
Large language models (LLMs) have become a disruptive force in industry, introducing unprecedented capabilities in natural language processing, logical reasoning, and other areas. However, difficulties with knowledge updates and hallucination have limited the application of LLMs in medical scenarios, where retrieval-augmented generation (RAG) can offer significant assistance. Nevertheless, existing retrieve-then-read approaches generally digest the retrieved documents as they are, without considering the timeliness, authoritativeness, and commonality of the retrieved content. We argue that these approaches can be suboptimal, especially in real-world applications where information from different sources may conflict and even information from the same source may differ over time; relying entirely on such retrieved content degrades the performance of RAG approaches. We propose PolyRAG, which carefully incorporates judgments from different perspectives and integrates these polyviews for retrieval-augmented generation in medical applications. To address the scarcity of real-world benchmarks for evaluation, we propose PolyEVAL, a benchmark consisting of queries and documents collected from real-world medical scenarios (including medical policy, hospital and doctor inquiry, and healthcare), annotated with multiple tags (e.g., timeliness, authoritativeness). Extensive experiments and analysis on PolyEVAL demonstrate the superiority of PolyRAG.
