SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation

Minda Hu; Licheng Zong; Hongru Wang; Jingyan Zhou; Jingjing Li; Yichen Gao; Kam-Fai Wong; Yu Li; Irwin King

TL;DR

This work proposes a novel plug-and-play LLM-based retrieval method called SeRTS based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm that effectively adapts LLMs to document retrieval tasks, enhancing their ability to retrieve highly relevant documents for RAG in the context of medical knowledge queries.

Abstract

Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG). However, existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries, resulting in sub-optimal performance. To address these limitations, we propose a novel plug-and-play LLM-based retrieval method called Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm. By combining the reasoning capabilities of LLMs with the effectiveness of tree search, SeRTS boosts the zero-shot performance of retrieving high-quality and informative results for RAG. We further enhance retrieval performance by fine-tuning LLMs with Proximal Policy Optimization (PPO) objectives using the trajectories collected by SeRTS as feedback. Controlled experiments using the BioASQ-QA dataset with GPT-3.5-Turbo and Llama2-7b demonstrate that our method significantly improves the performance of the BM25 retriever and surpasses the strong baseline of self-reflection in both efficiency and scalability. Moreover, SeRTS generates higher-quality feedback for PPO training than self-reflection. Our proposed method effectively adapts LLMs to document retrieval tasks, enhancing their ability to retrieve highly relevant documents for RAG in the context of medical knowledge queries. This work presents a significant step forward in leveraging LLMs for accurate and comprehensive biomedical question answering.

Paper Structure (35 sections, 4 equations, 3 figures, 12 tables)

Introduction
Related Works
Medical RAG.
Applications of MCTS.
Learning from Feedback.
Method
Self-Rewarding Tree Search
Result Evaluator for Quantitatively Measuring
Query Proposer for Efficiently Searching
Monte Carlo Tree Search Process
Selection.
Expansion.
Evaluation.
Backpropagation.
Proximal Policy Optimization of LLMs with Feedback
...and 20 more sections

Figures (3)

Figure 1: A straightforward Retrieval-Augmented Generation (RAG) pipeline
Figure 2: SeRTS method overview: Query Proposer $P^{query}_{\theta}$ generates query $A_i$. $\mathcal{R}_{bm25}$ retrieves relevant documents $D_i$. Result Evaluator $P^{eval}_{\theta}$ assesses $D_i$, provides Reward $R_i$ and Feedback $F_i$. Observations $O_i$ (previous queries, retrieved documents, feedback) serve as input to $P^{query}_{\theta}$ for subsequent query proposals. The entire trajectory $\mathcal{T}_{SeRTS}$ (initial question $Q$, $A_{1..i}$, $O_{1..i}$, $R_{1..i}$) is used to fine-tune the two language model agents via PPO, to improve their performance in query proposal and evaluation.
Figure 3: Overview of the four operations in MCTS
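
The loop described in the Figure 2 and Figure 3 captions can be illustrated with a minimal sketch. This is not the authors' implementation: the corpus, the term-overlap retriever (standing in for $\mathcal{R}_{bm25}$), the rule-based `evaluate` (standing in for the LLM Result Evaluator) and the rule-based `propose` (standing in for the LLM Query Proposer) are all hypothetical toy stand-ins, and the UCT selection rule with constant `c=1.4` is an assumed choice. It only shows how selection, expansion, evaluation and backpropagation fit together while a trajectory of queries and rewards is collected.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Toy corpus standing in for the biomedical document collection (assumption).
CORPUS = [
    "aspirin inhibits cyclooxygenase and reduces fever",
    "metformin lowers blood glucose in type 2 diabetes",
    "insulin regulates glucose uptake in muscle cells",
]

def retrieve(query: str, k: int = 1) -> List[str]:
    """Stand-in for the BM25 retriever: rank documents by term overlap."""
    terms = set(query.lower().split())
    return sorted(CORPUS, key=lambda d: -len(terms & set(d.split())))[:k]

def evaluate(question: str, docs: List[str]) -> Tuple[float, List[str]]:
    """Stand-in for the Result Evaluator: reward plus feedback (uncovered terms)."""
    q_terms = set(question.lower().split())
    doc_terms = set()
    for d in docs:
        doc_terms |= set(d.split())
    hit = q_terms & doc_terms
    return len(hit) / len(q_terms), sorted(q_terms - hit)

def propose(question: str, feedback: List[str]) -> List[str]:
    """Stand-in for the Query Proposer: rewrite the query using the feedback."""
    kept = [t for t in question.lower().split() if t not in feedback]
    candidates = [" ".join(feedback), " ".join(kept)]
    return [q for q in candidates if q]

@dataclass
class Node:
    query: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

def uct(node: Node, c: float = 1.4) -> float:
    """Upper confidence bound for trees; every child here has visits >= 1."""
    exploit = node.value / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore

def backpropagate(node: Optional[Node], reward: float) -> None:
    """Add the reward to the node and all of its ancestors."""
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent

def search(question: str, iterations: int = 5):
    root = Node(query=question)
    docs = retrieve(root.query)
    reward, root.feedback = evaluate(question, docs)
    backpropagate(root, reward)
    best = (reward, docs)
    trajectory = [(root.query, reward)]  # (query, reward) pairs, in visit order
    for _ in range(iterations):
        node = root
        while node.children:                          # Selection
            node = max(node.children, key=uct)
        for q in propose(question, node.feedback):    # Expansion
            child = Node(query=q, parent=node)
            node.children.append(child)
            docs = retrieve(q)                        # Evaluation
            reward, child.feedback = evaluate(question, docs)
            trajectory.append((q, reward))
            if reward > best[0]:
                best = (reward, docs)
            backpropagate(child, reward)              # Backpropagation
    return best, trajectory
```

Calling `search("Which drug lowers blood glucose in diabetes")` on this toy corpus returns the metformin passage as the best result, together with the list of proposed queries and their rewards. In the setting the abstract describes, such a trajectory would serve as the feedback signal for PPO fine-tuning; that step is omitted here.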
