CTP-LLM: Clinical Trial Phase Transition Prediction Using Large Language Models
Michael Reinisch, Jianfeng He, Chenxi Liao, Sauleh Ahmad Siddiqui, Bei Xiao
TL;DR
This work tackles the challenge of forecasting clinical trial phase transitions from protocol text, addressing high attrition and costly development. It introduces CTP-LLM, a GPT-3.5 Turbo–based model, and the PhaseTransition Dataset to automatically predict whether a trial advances to the next phase using only protocol documents. The results show that CTP-LLM achieves substantial predictive power, with an overall accuracy of $67\%$ across all phases and $75\%$ for Phase III→Approval, outperforming baselines like BERT+RF and Longformer. The study demonstrates the feasibility and value of LLM-powered CTOP, highlights critical trial-design features (e.g., Criteria, Brief), and provides a public dataset and benchmark to spur further research in protocol-driven outcome forecasting.
Abstract
New medical treatment development requires multiple phases of clinical trials. Despite the significant human and financial costs of bringing a drug to market, less than 20% of drugs in testing will make it from the first phase to final approval. Recent literature indicates that the design of the trial protocols significantly contributes to trial performance. We investigated Clinical Trial Outcome Prediction (CTOP) using trial design documents to predict phase transitions automatically. We propose CTP-LLM, the first Large Language Model (LLM) based model for CTOP. We also introduce the PhaseTransition (PT) Dataset; which labels trials based on their progression through the regulatory process and serves as a benchmark for CTOP evaluation. Our fine-tuned GPT-3.5-based model (CTP-LLM) predicts clinical trial phase transition by analyzing the trial's original protocol texts without requiring human-selected features. CTP-LLM achieves a 67% accuracy rate in predicting trial phase transitions across all phases and a 75% accuracy rate specifically in predicting the transition from Phase~III to final approval. Our experimental performance highlights the potential of LLM-powered applications in forecasting clinical trial outcomes and assessing trial design.
