A Deep Subgrouping Framework for Precision Drug Repurposing via Emulating Clinical Trials on Real-world Patient Data
Seungyeon Lee, Ruoqi Liu, Feixiong Cheng, Ping Zhang
TL;DR
Drug repurposing studies using real-world data often overlook patient heterogeneity, potentially missing drugs that are effective only in subgroups. The authors present STEDR, a deep learning framework that learns subgroup structure jointly with treatment-effect estimation, using dual-level attention for patient representation and a subgrouping network based on variational autoencoders. Applied to Alzheimer's disease (AD) with MarketScan MDCR data covering over 8 million patients and 1,134 candidate drugs, STEDR emulated 100 trials per drug and identified 14 repurposing candidates, including several with subgroup-specific benefits. It also outperformed baseline methods. The results demonstrate STEDR's ability to reveal precision drug repurposing opportunities and to characterize clinically relevant subgroups tied to AD risk factors, advancing personalized therapeutic strategies.
Abstract
Drug repurposing identifies new therapeutic uses for existing drugs, reducing time and cost compared with traditional de novo drug discovery. Most existing drug repurposing studies that use real-world patient data treat the entire population as homogeneous, ignoring the heterogeneity of treatment responses across patient subgroups. This approach may overlook promising drugs that benefit specific subgroups but lack notable treatment effects across the entire population, potentially limiting the number of repurposable candidates identified. To address this, we introduce STEDR, a novel drug repurposing framework that integrates subgroup analysis with treatment effect estimation. Our approach first identifies repurposing candidates by emulating multiple clinical trials on real-world patient data and then characterizes patient subgroups by learning subgroup-specific treatment effects. We apply STEDR to Alzheimer's Disease (AD), a condition with few approved drugs and known heterogeneity in treatment responses. We emulate trials for over one thousand medications on a large-scale real-world database covering over 8 million patients, identifying 14 drug candidates with beneficial effects on AD in characterized subgroups. Experiments demonstrate STEDR's superior capability in identifying repurposing candidates compared to existing approaches. Additionally, our method can characterize clinically relevant patient subgroups associated with important AD-related risk factors, paving the way for precision drug repurposing.
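To make the motivating point concrete, the following toy sketch shows how a population-level effect estimate can be zero while one subgroup clearly benefits. It is not the STEDR model: the cohort, the subgroup labels "A" and "B", the event counts, and the helper names `make_rows` and `risk_difference` are all hypothetical, and the sketch assumes treatment is assigned independently of patient characteristics, so no confounder adjustment is applied (an emulated trial on real-world data would need one).

```python
from statistics import mean

def make_rows(subgroup, treated, n, events):
    """Build n hypothetical patients; the first `events` have the adverse outcome (1)."""
    return [(subgroup, treated, 1 if i < events else 0) for i in range(n)]

# Hypothetical cohort: subgroup A benefits from the drug, subgroup B does not.
cohort = (
    make_rows("A", 1, 20, 4)     # A, treated: 4/20 events
    + make_rows("A", 0, 20, 10)  # A, control: 10/20 events
    + make_rows("B", 1, 80, 30)  # B, treated: 30/80 events
    + make_rows("B", 0, 80, 24)  # B, control: 24/80 events
)

def risk_difference(rows):
    """Event rate among treated minus event rate among controls (negative = benefit)."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

# Effect over the whole population versus within each subgroup.
overall = risk_difference(cohort)
by_subgroup = {
    g: risk_difference([r for r in cohort if r[0] == g]) for g in ("A", "B")
}

print("overall:", overall)
print("by subgroup:", by_subgroup)
```

In this constructed cohort the treated and control arms both have 34 events per 100 patients, so the overall risk difference is zero, while subgroup A shows a risk difference of about -0.3. A population-level screen would discard such a drug even though a subgroup-aware analysis would flag it.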
