A Survey on LLM-Assisted Clinical Trial Recruitment

Shrestha Ghosh, Moritz Schneider, Carina Reinicke, Carsten Eickhoff

TL;DR

This survey addresses the gap in analyzing how large language models can assist clinical trial recruitment by formalizing trial–patient matching, surveying data sources and public benchmarks, and categorizing LLM-based approaches across criterion-level, trial-level, and ranking tasks. It critiques existing evaluation frameworks, highlights the scarcity of standardized datasets, and discusses critical challenges such as data privacy, annotation scalability, and the reliability of LLM explanations in health contexts. By proposing an error taxonomy and outlining future directions like interactive trial design and explainable matches, the paper emphasizes both the promise and the cautions necessary for responsibly deploying LLMs in clinical research. The work provides a structured lens on benchmarks (N2C2, TREC CT) and methods, offering actionable guidance for researchers to develop interactive, transparent, and generalizable recruitment solutions with LLMs. Overall, the paper advances understanding of how to leverage LLMs for more efficient, scalable, and interpretable trial recruitment while acknowledging societal and data-sensitivity implications.

Abstract

Recent advances in LLMs have greatly improved performance on general-domain NLP tasks. Yet their adoption in critical domains, such as clinical trial recruitment, remains limited. As trials are designed in natural language and patient data is represented as both structured and unstructured text, the task of matching trials and patients benefits from the knowledge aggregation and reasoning abilities of LLMs. Classical approaches are trial-specific, whereas LLMs, with their ability to consolidate distributed knowledge, hold the potential to build a more general solution. Yet recent applications of LLM-assisted methods rely on proprietary models and weak evaluation benchmarks. In this survey, we are the first to analyze the task of trial-patient matching and contextualize emerging LLM-based approaches in clinical trial recruitment. We critically examine existing benchmarks, approaches, and evaluation frameworks; the challenges to adopting LLM technologies in clinical research; and exciting future directions.

Paper Structure

This paper contains 44 sections, 1 equation, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Components in a patient recruitment process: conventional linear flow (in orange) vs. our proposed LLM-assisted interactive flow (in purple).
  • Figure 2: Taxonomy of errors in LLM generations.