Zero-Shot Clinical Trial Patient Matching with LLMs

Michael Wornow, Alejandro Lozano, Dev Dash, Jenelle Jindal, Kenneth W. Mahaffey, Nigam H. Shah

TL;DR

This work shows that zero-shot large language models can match patients to clinical trial inclusion criteria from unstructured EHR notes, achieving state-of-the-art performance on the n2c2 2018 cohort selection benchmark without fine-tuning. It introduces four prompting strategies (ACAN, ACIN, ICAN, ICIN) and a two-stage retrieval pipeline that pre-filters notes with small embedding models, reducing token usage while preserving high accuracy. The study also assesses interpretability by having clinicians evaluate the natural language rationales accompanying each decision: the rationales were judged coherent for 97% of correct decisions and 75% of incorrect ones. Together, the results indicate that LLMs can accelerate clinical trial screening at reduced data and monetary cost, with human-in-the-loop validation offering a practical deployment path. The work also discusses limitations and future directions for real-world scale, generalization across trials, and privacy considerations in health systems.

Abstract

Matching patients to clinical trials is a key unsolved challenge in bringing new drugs to market. Today, identifying patients who meet a trial's eligibility criteria is highly manual, taking up to 1 hour per patient. Automated screening is challenging, however, as it requires understanding unstructured clinical text. Large language models (LLMs) offer a promising solution. In this work, we explore their application to trial matching. First, we design an LLM-based system which, given a patient's medical history as unstructured clinical text, evaluates whether that patient meets a set of inclusion criteria (also specified as free text). Our zero-shot system achieves state-of-the-art scores on the n2c2 2018 cohort selection benchmark. Second, we improve the data and cost efficiency of our method by identifying a prompting strategy which matches patients an order of magnitude faster and more cheaply than the status quo, and develop a two-stage retrieval pipeline that reduces the number of tokens processed by up to a third while retaining high performance. Third, we evaluate the interpretability of our system by having clinicians evaluate the natural language justifications generated by the LLM for each eligibility decision, and show that it can output coherent explanations for 97% of its correct decisions and 75% of its incorrect ones. Our results establish the feasibility of using LLMs to accelerate clinical trial operations.
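
The two-stage retrieval idea described above can be illustrated with a minimal sketch: stage one ranks note chunks by similarity to a criterion and keeps the top k, and stage two sends only those chunks to the LLM. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`embed`, `retrieve_top_k`, `build_prompt`), the prompt wording, the value of k, and the made-up notes and criterion are hypothetical, and the bag-of-words `embed` is a toy stand-in for a small embedding model. The LLM call itself is omitted.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a small embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(criterion: str, chunks: list[str], k: int) -> list[str]:
    """Stage 1: keep only the k chunks most similar to the criterion."""
    query = embed(criterion)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(criterion: str, chunks: list[str]) -> str:
    """Stage 2 input: the text that would be sent to the LLM (call not shown)."""
    notes = "\n---\n".join(chunks)
    return (
        f"Criterion: {criterion}\n\nNotes:\n{notes}\n\n"
        "Does the patient meet the criterion?"
    )


# Made-up example data, for illustration only.
chunks = [
    "Patient reports taking aspirin daily to prevent myocardial infarction.",
    "Family history is notable for breast cancer in mother.",
    "Follow-up visit for seasonal allergies; no new complaints.",
]
criterion = "Use of aspirin to prevent myocardial infarction"

selected = retrieve_top_k(criterion, chunks, k=1)
prompt = build_prompt(criterion, selected)
full_prompt = build_prompt(criterion, chunks)
```

Because only the retrieved chunks reach the second stage, `prompt` is shorter than `full_prompt`, which is the mechanism by which pre-filtering reduces the number of tokens processed. In practice the choice of k trades token savings against the risk of discarding a relevant note.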

Paper Structure

This paper contains 24 sections, 1 equation, 7 figures, and 9 tables.

Figures (7)

  • Figure 1: Zero-Shot system design
  • Figure 2: Two-stage retrieval pipeline performance
  • Figure 3: Clinician assessment of LLM-generated rationales
  • Figure S1: Criterion-level confusion matrices
  • Figure S2: Two-Stage Retrieval with $ACIN$ strategy
  • ...and 2 more figures