MiranDa: Mimicking the Learning Processes of Human Doctors to Achieve Causal Inference for Medication Recommendation
Ziheng Wang, Xinhe Li, Haruki Momma, Ryoichi Nagatomi
TL;DR
MiranDa introduces a causal-inference framework that mimics physician learning by pairing supervised, evidence-based training with gradient-space reinforcement learning guided by counterfactual outcomes expressed as $ELOS$ (estimated length of stay). Using two retrieval-based action-space expansions and a reward based on $ELOS$ differences, MiranDa refines medication recommendations and analyzes the structure of the recommended drug combinations in hyperbolic space. On the MIMIC-III and MIMIC-IV datasets, MiranDa achieves near-identical real LOS for counterfactual evaluation and superior metrics (e.g., ROC AUC and PR AUC), while revealing procedure-specific attributes and sparser, more targeted medication regimens. This work advances a general, causal-inference-driven paradigm that can be applied to diverse medical tasks and beyond, though the authors note computational overhead and the need for cautious interpretation in observational settings.
Abstract
To enhance therapeutic outcomes from a pharmacological perspective, we propose MiranDa, a medication recommendation model that is the first actionable model capable of providing the estimated length of stay in hospital (ELOS) as a counterfactual outcome to guide both clinical practice and model training. Specifically, MiranDa emulates the educational trajectory of doctors through two gradient-scaling phases shifted by ELOS: an Evidence-based Training Phase that uses supervised learning, and a Therapeutic Optimization Phase that is grounded in reinforcement learning within the gradient space and explores optimal medications through perturbations from ELOS. Evaluation on the Medical Information Mart for Intensive Care III and IV datasets showed superior results for our model across five metrics, particularly in reducing ELOS. Surprisingly, our model reveals structural attributes of medication combinations, demonstrated in hyperbolic space, and advocates "procedure-specific" medication combinations. These findings suggest that MiranDa enhances medication efficacy. Notably, our paradigm can be applied to nearly all medical tasks, as well as to other tasks that have information available for evaluating predicted outcomes. The source code of the MiranDa model is available at https://github.com/azusakou/MiranDa.
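The abstract does not state how ELOS enters the reward or the gradient scaling, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the reward is the plain difference between a reference ELOS and the ELOS of a candidate regimen, and that gradients are scaled by a linear factor of that reward. The function names `elos_reward` and `scale_gradient`, the `strength` parameter, and all numeric values are hypothetical.

```python
from typing import List, Sequence


def elos_reward(elos_reference: float, elos_candidate: float) -> float:
    """Assumed reward: positive when the candidate regimen shortens the estimated stay."""
    return elos_reference - elos_candidate


def scale_gradient(grad: Sequence[float], reward: float, strength: float = 0.1) -> List[float]:
    """Hypothetical scaling rule: multiply each gradient entry by (1 + strength * reward)."""
    factor = 1.0 + strength * reward
    return [g * factor for g in grad]


# Made-up values for illustration only (in days).
reference_elos = 6.0  # ELOS estimated for the recorded prescription
candidate_elos = 4.5  # ELOS estimated for a perturbed medication set

reward = elos_reward(reference_elos, candidate_elos)
scaled = scale_gradient([0.2, -0.4], reward)
print(reward, scaled)
```

Under these assumptions, a candidate with a shorter estimated stay yields a positive reward and amplifies the update, while a longer estimated stay yields a negative reward and dampens it. The actual reward and scaling used by MiranDa may differ; the linked repository is the authoritative source.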
