Retrieval Augmented Thought Process for Private Data Handling in Healthcare

Thomas Pouplin; Hao Sun; Samuel Holt; Mihaela van der Schaar

TL;DR

This work addresses privacy and data-staleness barriers to deploying LLMs in healthcare by grounding reasoning in external documents through the Retrieval-Augmented Thought Process (RATP). RATP formalizes open-book QA as a multi-step decision problem and solves it with Monte-Carlo Tree Search, using either a model-based score estimator or a self-critic to evaluate thoughts, while keeping LLMs frozen to protect private data. Empirical results on private EMRQA and EhrQA datasets show sizable accuracy gains over in-context RAG, with additional improvements when incorporating retrieved documents and robust exploration of the thought space. The framework delivers transparent, step-by-step reasoning traces and is readily generalizable, offering a privacy-preserving path to clinically grounded AI assistance.

Abstract

Large Language Models (LLMs) have demonstrated strong potential to assist both clinicians and the general public with their extensive medical knowledge. However, their application in healthcare is constrained by concerns about the privacy of data used in training, which prevent the integration of private and personal information for security and ethical reasons. Moreover, although their capabilities can be enhanced with information retrieval to access up-to-date knowledge, current integrations of LLMs with information retrieval lack robustness to imperfect retrieval, which can hinder their effectiveness and even reduce overall performance. In this work, we address these challenges by introducing the Retrieval-Augmented Thought Process (RATP). Given access to external knowledge, RATP formulates the thought generation of LLMs as a multi-step decision process. To optimise such a thought process, RATP leverages Monte-Carlo Tree Search and learns a proxy reward function that permits cost-efficient inference. On a private dataset of electronic medical records, deliberately excluded from any LLM training set, RATP achieves 35% additional accuracy compared to in-context retrieval-augmented generation for the question-answering task.

Paper Structure (32 sections, 5 equations, 16 figures, 13 tables, 5 algorithms)

Introduction
Preliminaries
Retrieval-Augmented Thought Process
Multi-Step Thought Generation as a Markov Decision Process
Planning with Monte Carlo Tree Search
Scoring models
Related Work
Experimental setup
Analysis
Benchmark
Conclusion
Large Language Applications in Healthcare enabled by RATP
Experimental details
Private knowledge: Unstructured Electronic Medical Records
Public knowledge: BoolQ dataset
...and 17 more sections

Figures (16)

Figure 1: Retrieval-Augmented Thought Process overview. ① The frozen LLM $l_{thought}$ gives an answer $\hat{y}$ to the question $x$ by using the extra context $s_T$. ② The thought process starts from the question $x$ and outputs the best thought found $s_t$ to help answer $x$. The actions $\{a_i\}$ are decided by the MCTS with feedback from the scoring model. This component is detailed in Figure 2. ③ The information retrieval system interacts with the thought process by answering its queries with retrieved documents $\{I_i\}$.
Figure 2: Modeling the thought process. Each thought is generated from previous thoughts and/or documents, effectively creating a graph. The planning policy controlling the construction of this graph is detailed in Figure 3.
Figure 3: One complete step of our MCTS decision process. It is divided into four functions, which are repeated until the answer is found or the thought process size limit is reached. The Selection, Expansion, Simulation, and Backpropagation functions are described in the section "Planning with Monte Carlo Tree Search". Their associated algorithms can be found in the appendix.
Figure 4: Evolution of the accuracy and the number of LLM queries. As the thought process size (i.e. the number of thoughts generated) increases, accuracy improves, but so does the number of LLM queries.
Figure 5: Examples of Unstructured Electronic Medical Records. For privacy reasons, we present simulated EMRs resembling the actual dataset.
...and 11 more figures
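
The four functions named in the Figure 3 caption (Selection, Expansion, Simulation, Backpropagation) can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation: `toy_generate` and `toy_score` are hypothetical stand-ins for the frozen LLM and the scoring model, the documents and keywords are invented toy data, and the simulation step is collapsed into a single call to the scorer. The UCT selection rule is a standard MCTS choice assumed here for illustration.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    thought: str
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    visits: int = 0
    value: float = 0.0  # running sum of rewards backed up through this node


def toy_generate(thought, document):
    # Hypothetical stand-in for the frozen LLM: merge a thought with a document.
    return (thought + " " + document).strip()


def toy_score(thought, keywords):
    # Hypothetical stand-in for the scoring model: fraction of keywords covered.
    words = set(thought.lower().split())
    return sum(k in words for k in keywords) / len(keywords)


def uct(child, parent_visits, c=1.4):
    # Unvisited children are always tried first.
    if child.visits == 0:
        return math.inf
    explore = math.sqrt(math.log(max(parent_visits, 1)) / child.visits)
    return child.value / child.visits + c * explore


def select(root):
    # Selection: descend by UCT until a node without children is reached.
    node = root
    while node.children:
        parent_visits = node.visits
        node = max(node.children, key=lambda ch: uct(ch, parent_visits))
    return node


def expand(node, documents):
    # Expansion: one new thought per retrieved document.
    for doc in documents:
        node.children.append(Node(toy_generate(node.thought, doc), parent=node))
    return node.children


def backpropagate(node, reward):
    # Backpropagation: update statistics from the scored node up to the root.
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent


def search(question, documents, keywords, max_thoughts=30, seed=0):
    rng = random.Random(seed)
    root = Node(question)
    best, best_score, n_thoughts = root, toy_score(question, keywords), 0
    # Repeat until a full-score thought is found or the size limit is reached.
    while n_thoughts < max_thoughts and best_score < 1.0:
        leaf = select(root)
        children = expand(leaf, documents)
        n_thoughts += len(children)
        child = rng.choice(children)
        reward = toy_score(child.thought, keywords)  # simulation, simplified
        backpropagate(child, reward)
        if reward > best_score:
            best, best_score = child, reward
    return best.thought, best_score


# Toy data, invented for illustration only.
question = "what dose of metformin was prescribed"
documents = [
    "metformin 500mg twice daily",
    "patient reports mild nausea",
    "follow-up in two weeks",
]
keywords = ["metformin", "500mg", "daily"]
thought, score = search(question, documents, keywords)
print(score, thought)
```

The `max_thoughts` limit plays the role of the thought process size discussed in the Figure 4 caption: raising it allows more exploration at the cost of more generator and scorer calls.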
