ClinicalTrialsHub: Bridging Registries and Literature for Comprehensive Clinical Trial Access
Jiwoo Park, Ruoqi Liu, Avani Jagdale, Andrew Srisuwananukorn, Jing Zhao, Lang Li, Ping Zhang, Sachin Kumar
TL;DR
ClinicalTrialsHub addresses the fragmentation between ClinicalTrials.gov (CTG) and PubMed by unifying registry data with literature-derived content. It uses LLM-based extraction to convert PubMed full-text articles into a CTG-like structured schema and provides attribution-grounded question answering (QA). The system is validated through automatic extraction benchmarks and a user study with medical professionals. It expands the searchable structured trial information by 83.8% relative to ClinicalTrials.gov alone, improves search relevance via BM25-based reranking, and delivers evidence-grounded answers. These capabilities promise faster, more reliable access to comprehensive clinical-trial evidence for patients, clinicians, and researchers. Planned enhancements include domain-specific fine-tuning and larger-scale usability evaluations.
Abstract
We present ClinicalTrialsHub, an interactive, search-focused platform that consolidates all data from ClinicalTrials.gov and augments it by automatically extracting and structuring trial-relevant information from PubMed research articles. Our system increases access to structured clinical trial data by 83.8% compared to relying on ClinicalTrials.gov alone, which could make trial evidence easier to reach for patients, clinicians, researchers, and policymakers and thereby advance evidence-based medicine. ClinicalTrialsHub uses large language models such as GPT-5.1 and Gemini-3-Pro to enhance accessibility. Specifically, the platform automatically parses full-text research articles to extract structured trial information, translates user queries into structured database searches, and provides an attributed question-answering system that generates evidence-grounded answers linked to specific source sentences. We demonstrate its utility through a user study involving clinicians, clinical researchers, and PhD students in pharmaceutical sciences and nursing, and through a systematic automatic evaluation of its information extraction and question-answering capabilities.
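To make the BM25-based reranking mentioned in the TL;DR concrete, the following is a minimal, self-contained sketch of Okapi BM25 scoring applied to a small candidate list. It is an illustration under stated assumptions, not the platform's implementation: the function names (`tokenize`, `bm25_rerank`), the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the example trial titles are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def bm25_rerank(query, docs, k1=1.5, b=0.75):
    """Return (doc, score) pairs sorted by descending Okapi BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scored = []
    for idx, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, idx))

    # Highest score first; ties keep the original candidate order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], s) for s, i in scored]


if __name__ == "__main__":
    # Hypothetical candidate titles, not taken from the system.
    candidates = [
        "metformin trial in type 2 diabetes",
        "aspirin for headache",
        "diabetes prevention lifestyle study",
    ]
    for doc, score in bm25_rerank("metformin diabetes", candidates):
        print(f"{score:.3f}  {doc}")
```

In this sketch, a candidate that matches more query terms, or matches rarer ones, moves toward the top, while a candidate with no matching terms receives a score of zero. A production system would typically use a proper tokenizer and score richer fields than titles.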
