Locally Interpretable Individualized Treatment Rules for Black-Box Decision Models
Yasin Khadem Charvadeh, Katherine S. Panageas, Yuan Chen
TL;DR
The paper tackles the problem of deriving individualized treatment rules that remain interpretable while leveraging the predictive power of black-box models. It introduces LI-ITR, which uses a $\beta$-VAE to generate realistic latent-space perturbations and a hierarchical mixture of linear experts with a gating network to produce subject-specific local rules. Simulation results show LI-ITR accurately recovers local treatment-effect coefficients and achieves near-perfect accuracy in identifying optimal treatments, outperforming LIME and other baselines, even under misspecification. A real-data application to hepatotoxicity in breast-cancer endocrine therapy demonstrates LI-ITR’s practical utility, delivering interpretable, patient-specific recommendations with favorable policy values and strong alignment to the black-box predictor.
Abstract
Individualized treatment rules (ITRs) aim to optimize healthcare by tailoring treatment decisions to patient-specific characteristics. Existing methods typically rely on either interpretable but inflexible models or highly flexible black-box approaches that sacrifice interpretability; moreover, most impose a single global decision rule across patients. We introduce the Locally Interpretable Individualized Treatment Rule (LI-ITR) method, which combines flexible machine learning models to accurately learn complex treatment outcomes with locally interpretable approximations to construct subject-specific treatment rules. LI-ITR employs variational autoencoders to generate realistic local synthetic samples and learns individualized decision rules through a mixture of interpretable experts. Simulation studies show that LI-ITR accurately recovers true subject-specific local coefficients and optimal treatment strategies. An application to precision side-effect management in breast cancer illustrates the necessity of flexible predictive modeling and highlights the practical utility of LI-ITR in estimating optimal treatment rules while providing transparent, clinically interpretable explanations.
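The general idea of a locally interpretable treatment rule can be illustrated with a deliberately simplified sketch. It is not the LI-ITR procedure: plain Gaussian perturbations in covariate space stand in for the VAE-based latent-space sampling, and a single proximity-weighted linear fit stands in for the mixture of interpretable experts. The function `black_box_cate`, the helper `local_linear_rule`, and the parameters `n_samples` and `scale` are hypothetical names introduced only for this illustration.

```python
import numpy as np


def black_box_cate(X):
    """Hypothetical stand-in for a fitted black-box model's predicted
    treatment effect (treated minus untreated outcome) for each row of X."""
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 - 0.3


def local_linear_rule(x0, predict_effect, n_samples=2000, scale=0.3, seed=0):
    """Fit a proximity-weighted linear surrogate to predict_effect around x0.

    Assumption: Gaussian perturbations replace the VAE-generated samples,
    and one linear model replaces the mixture of experts.
    """
    rng = np.random.default_rng(seed)
    # Synthetic neighbours of the subject of interest.
    Z = x0 + scale * rng.standard_normal((n_samples, x0.size))
    tau = predict_effect(Z)
    # Gaussian kernel weights: closer neighbours count more.
    d2 = np.sum((Z - x0) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * scale ** 2))
    # Weighted least squares on features centred at x0, so the intercept
    # approximates the predicted treatment effect at x0 itself.
    design = np.column_stack([np.ones(n_samples), Z - x0])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], tau * sw, rcond=None)
    intercept, slopes = coef[0], coef[1:]
    # Recommend treatment when the local effect estimate is positive
    # (assuming larger outcomes are better).
    treat = int(intercept > 0)
    return intercept, slopes, treat


x0 = np.array([0.5, 1.0])
intercept, slopes, treat = local_linear_rule(x0, black_box_cate)
```

In this toy setting the slopes approximate the local gradient of the black-box effect at `x0`, which is what makes the rule readable for a single subject: each coefficient indicates how the estimated benefit of treatment changes with a covariate in that subject's neighbourhood.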
