Early Risk Stratification of Dosing Errors in Clinical Trials Using Machine Learning

Félicien Hêche, Sohrab Ferdowsi, Anthony Yazdani, Sara Sansaloni-Pastor, Douglas Teodoro

TL;DR

This study introduces a reproducible and scalable machine learning (ML) framework for early, trial-level risk stratification of clinical trials (CTs) at risk of high dosing error rates, supporting proactive, risk-based quality management in clinical research.

Abstract

Objective: To develop a machine learning (ML)-based framework for early risk stratification of clinical trials (CTs) according to their likelihood of exhibiting a high rate of dosing errors, using information available prior to trial initiation.

Materials and Methods: We constructed a dataset of 42,112 CTs from ClinicalTrials.gov. We extracted structured and semi-structured trial data as well as unstructured, protocol-related free text. Each CT was assigned a binary label indicating an elevated dosing error rate, derived from adverse event reports, MedDRA terminology, and Wilson confidence intervals. We evaluated an XGBoost model trained on structured features, a ClinicalModernBERT model trained on textual data, and a simple late-fusion model combining both modalities. Post-hoc probability calibration was applied to enable interpretable, trial-level risk stratification.

Results: The late-fusion model achieved the highest AUC-ROC (0.862). Beyond discrimination, calibrated outputs enabled robust stratification of CTs into predefined risk categories. The proportion of trials labeled as having an excessively high dosing error rate increased monotonically across higher predicted risk groups and aligned with the corresponding predicted probability ranges.

Discussion: These findings indicate that dosing error risk can be anticipated at the trial level using pre-initiation information. Probability calibration was essential for translating model outputs into reliable and interpretable risk categories, while simple multimodal integration yielded performance gains without requiring complex architectures.

Conclusion: This study introduces a reproducible and scalable ML framework for early, trial-level risk stratification of CTs at risk of high dosing error rates, supporting proactive, risk-based quality management in clinical research.
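To make the labeling idea concrete, the following is a minimal sketch of how a Wilson confidence interval could turn a trial's dosing-error count into a binary label. The abstract does not state the exact decision rule, so everything specific here is an assumption: the function names `wilson_interval` and `label_trial`, the `reference_rate` parameter, the 95% level (`z = 1.96`), and the rule itself (label a trial positive only when the lower Wilson bound exceeds the reference rate).

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion events / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = events / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point round-off at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


def label_trial(events, n, reference_rate, z=1.96):
    """Hypothetical rule: 1 if the lower Wilson bound exceeds reference_rate."""
    lower, _ = wilson_interval(events, n, z)
    return int(lower > reference_rate)
```

Under this assumed rule, a small trial with 1 dosing error among 5 participants (observed rate 0.2) is not labeled positive against a reference rate of 0.1, because its lower bound is only about 0.04, whereas 50 errors among 100 participants is. Using the interval rather than the raw rate guards against over-labeling small trials.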

Paper Structure

This paper contains 18 sections, 4 figures, and 9 tables.

Figures (4)

  • Figure 1: Overview of the dataset construction steps, from CT selection to feature and label construction and dataset splitting. Created with https://BioRender.com.
  • Figure 2: Overview of the dosing-error labeling pipeline. MedDRA high-level group terms were selected and reviewed by clinical pharmacology experts, matched to reported adverse events, aggregated at the trial level, and converted into binary risk labels using a Wilson confidence interval–based threshold. Created with https://BioRender.com.
  • Figure 3: Illustration of the bias introduced by splitting CTs chronologically according to initiation dates. Because only completed or terminated trials are included, trials with shorter durations are more likely to fall within later time windows, leading to their overrepresentation in the validation and test sets.
  • Figure 4: Comparison of distributional shifts induced by different temporal splitting strategies. Kernel density estimates of enrollment counts across the training, validation, and test sets are shown for each strategy; distributions are truncated at the 95th percentile for visualization purposes. The left panel shows splitting by trial initiation date, and the right panel shows splitting by trial completion date, which we adopt in this work.
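
The completion-date splitting strategy described for Figures 3 and 4 can be sketched as below. This is only an illustration under stated assumptions: the function name `split_by_completion`, the `completion_date` field, the trial identifiers, and both cutoff dates are hypothetical, and the captions do not give the actual cutoffs or split proportions.

```python
from datetime import date


def split_by_completion(trials, train_end, val_end):
    """Assign each trial to train/val/test by its completion date.

    trials: iterable of dicts with a 'completion_date' (datetime.date) key.
    train_end, val_end: hypothetical cutoff dates, with train_end < val_end.
    """
    if train_end >= val_end:
        raise ValueError("train_end must precede val_end")
    splits = {"train": [], "val": [], "test": []}
    for trial in trials:
        completed = trial["completion_date"]
        if completed <= train_end:
            splits["train"].append(trial)
        elif completed <= val_end:
            splits["val"].append(trial)
        else:
            splits["test"].append(trial)
    return splits


# Illustrative records only; identifiers and dates are made up.
example_trials = [
    {"id": "trial-A", "completion_date": date(2015, 6, 1)},
    {"id": "trial-B", "completion_date": date(2019, 3, 15)},
    {"id": "trial-C", "completion_date": date(2021, 11, 30)},
]
example_splits = split_by_completion(
    example_trials, train_end=date(2018, 12, 31), val_end=date(2020, 12, 31)
)
```

Splitting on completion date means a trial's position in time does not depend on how long it ran, which is the duration-related bias that Figure 3 attributes to splitting on initiation date.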