Clinical Trials Protocol Authoring using LLMs

Morteza Maleki; SeyedAli Ghahari

Clinical Trials Protocol Authoring using LLMs

Morteza Maleki, SeyedAli Ghahari

TL;DR

The paper tackles the time-consuming process of clinical trial protocol authoring by leveraging GPT-4-based generative AI and a metadata-driven data pipeline to automate protocol sections. It evaluates both traditional LLMs and OpenAI GPT models, with prompt engineering proving crucial for long-form content quality. Key contributions include a data-collection and preprocessing framework combining drug- and study-level metadata, a comparative model study across GPT-3.5 and GPT-4 variants, and a cost-performance analysis that informs model selection for scalable use. The work demonstrates that AI-generated protocol sections can expedite design while maintaining coherence and regulatory relevance, and it outlines practical directions for integrating AI into future trial-design workflows. Overall, the findings provide a foundation for broader adoption and further innovation in AI-assisted clinical trial design.

Abstract

This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies. With a focus on leveraging the capabilities of generative AI, specifically GPT-4, this initiative aimed to streamline and enhance the efficiency and accuracy of clinical trial protocols. The methodology encompassed a detailed analysis and preparation of comprehensive drug and study level metadata, followed by the deployment of GPT-4 for automated protocol section generation. Results demonstrated a significant improvement in protocol authoring, highlighted by increases in efficiency, accuracy, and the customization of protocols to specific trial requirements. Challenges encountered during model selection and prompt engineering were systematically addressed, leading to refined methodologies that capitalized on the advanced text generation capabilities of GPT-4. This project not only showcases the practical applications and benefits of generative AI in clinical trial design but also sets a foundation for future innovations in the field.

Clinical Trials Protocol Authoring using LLMs

TL;DR

Abstract

Paper Structure (25 sections, 9 figures, 3 tables)

This paper contains 25 sections, 9 figures, 3 tables.

Introduction
Clinical Trials
Study Design in Clinical Trials
Challenges and Pain Points in Study Design
Literature Review
Data Sources and Analysis
Drug level metadata
Study level metadata
Data processing and Preparation
Data Availability
Model Development and Evaluation
LLM Models Training
OpenAI GPT Models
Prompt Engineering
Results
...and 10 more sections

Figures (9)

Figure 1: Aggregated (across all number of examples) metrics across all models.
Figure 2: Metric comparison for GPT-4o model with varying number of examples (i.e. 0, 1, 2, 3).
Figure 3: Evaluation Metrics for all GPT Models and a variation number of examples provided in the prompt
Figure 4: Forecast Cost Analysis for GPT Models with Varying Number of Examples.
Figure 5: Metric comparison for GPT-3.5-Turbo model with varying number of examples (i.e. 0, 1, 2, 3).
...and 4 more figures

Clinical Trials Protocol Authoring using LLMs

TL;DR

Abstract

Clinical Trials Protocol Authoring using LLMs

Authors

TL;DR

Abstract

Table of Contents

Figures (9)